ABSTRACT

The growth of indexing services has emphasized the 
need for more knowledge of the indexing process itself. Consistency 
is necessary for continuing progress in the field. This study 
postulates that: (1) definitions of indexer consistency should
consist of the indexer's perception of indexable concepts and his 
choice of terminology; (2) both parts of the definition can be 
measured separately; (3) there will be a large difference in the 
degree of each; and (4) indexer consistency scores should contain
both elements. For the study, five indexers read 550 journal articles 
and labeled the concepts discussed in each article. Findings from 
this exercise indicate a need for a re-examination of the problem of 
indexer consistency and its relation to: (1) tests of the
effectiveness and efficiency of indexing languages and systems; (2) 
index tools and methodology; (3) index research, much of which has 
concentrated on terminological relationships to the neglect of 
concept-related problems; and (4) indexer consistency as a factor in 
indexer-user consistency in choice of concepts or terms for the 
retrieval of indexed information. (Author/SJ)

INDEXER CONSISTENCY IN PERCEPTION OF CONCEPTS
AND IN CHOICE OF TERMINOLOGY 

Barbara Meitin Preschel 

The growth of indexing services and of the need 
for indexes has emphasized the need for more knowledge of 
the indexing process itself. Indexing cannot become more 
scientific until the process is better understood and the 
products of individual indexing systems are more consistent.
Consistency is necessary, even if not sufficient,
for continuing progress in the field. 

Previous studies of indexer consistency have defined
it as the degree of replication in the index terms
chosen independently by two or more indexers, or by the
same indexer at different times, to label the informational
content of a given text as a means of providing access
to the information in the text. Indexer consistency
scores have been primarily a measure of the degree of 
replication in the index terms so chosen. 

This approach has resulted in measures that commingle,
in an undifferentiated manner, indexer consistency
in the two parts of the indexing process:

1. Indexer perception of indexable concepts;

2. Indexer choice of terminology with which to label the
concepts perceived.

This study postulates: 

1. That definitions of indexer consistency should state 
that it consists of indexer consistency in each of the two 
parts of the indexing process listed above;

2. That these parts can be measured separately; 

3. That there will be a gross difference in the degree of 
each; 

4. That indexer consistency scores should be determined by
a planned use of both measurements. 

For the purposes of the study, copies of 550 journal
articles were separated into 22 packets of 25 articles each. 
All the articles in each packet were read by each of five 
indexers who were instructed to identify and label the con- 
cepts discussed in each article. 

When the analysis of a given packet had been comple- 
ted by the indexers assigned to it, concept categories were 
established for each article based on the concepts perceived 
by the indexers. 

The labels created by each indexer for each article 
were then examined to discover which concept categories, of 
all the concept categories established by all the indexers 
for that article, were included in the labels an individual 
indexer had created for that article.

Each indexer was then paired successively with every
other indexer for the article and a mean inter-indexer concept
consistency score for all pairs for each article was
established.

The terminology of each of the labels created by each 
pair of indexers for each article was then compared and a 
mean inter-indexer terminology consistency score for all 
pairs for each article was established. 
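
As an illustration only, the two mean scores described above can be
sketched in Python. The overlap measure used here (items shared by a
pair of indexers divided by all items used by either of them) is an
assumption modeled on the pair measure attributed to Hooper in
Chapter II; the study's own formulas are described in Chapter III and
are not reproduced in this sketch. The indexer names, concept
categories, and terms are hypothetical.

```python
from itertools import combinations
from statistics import mean


def pair_overlap(a, b):
    """Percentage of items shared by two indexers out of all items either used.

    Assumed measure (not taken from the study): 100 * |a & b| / |a | b|.
    """
    union = a | b
    if not union:
        return 0.0  # assumption: two empty sets count as no agreement
    return 100.0 * len(a & b) / len(union)


def mean_pairwise(items_by_indexer):
    """Mean of pair_overlap over every pair of indexers for one article."""
    pairs = combinations(items_by_indexer.values(), 2)
    return mean(pair_overlap(a, b) for a, b in pairs)


# Hypothetical data for a single article and three indexers.
# Concept categories are identified by arbitrary codes.
concepts = {
    "A": {"c1", "c2", "c3"},
    "B": {"c1", "c2"},
    "C": {"c1", "c2", "c3"},
}
# The labels the same indexers wrote for those concepts.
terms = {
    "A": {"indexer consistency", "thesauri", "term choice"},
    "B": {"consistency of indexing", "thesaurus"},
    "C": {"indexer consistency", "thesauri", "vocabulary"},
}

concept_score = mean_pairwise(concepts)
term_score = mean_pairwise(terms)
print(round(concept_score, 1), round(term_score, 1))
```

In this invented example the indexers largely agree on which concepts
are present but seldom word them identically, so the concept score is
much higher than the terminology score.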

For each of the articles in the study, the mean inter- 
indexer consistency in identification of concepts score was 
significantly higher than the mean inter-indexer consistency 
in choice of terminology score. In 500 of the 550 articles, 
it was 21.0 percentage points or more higher. Scores of 
mean inter-indexer consistency in choice of terminology ranged 
from 0.0% to 30.0%. Scores of mean inter-indexer consistency
in the perception of concepts ranged from 9 to 84.0%. The
statistical findings of the study revealed a pattern in which 
the mean terminology consistency scores clustered at the low 
end of the indexer consistency percentile range and the mean 
concept consistency scores clustered at the middle or upper 
end.

These findings indicate a need for a re-examination 
of the problem of indexer consistency and its relation to:

1. Tests of the effectiveness and efficiency of indexing
languages and systems, since the findings of these tests
would undoubtedly be affected if indexer consistency in
perception of indexable matter was overtly one of the variables
studied;

2. Index tools and methodology, in particular instructions 
to indexers on the construction and use of thesauri and
instructions on what kinds of concepts are indexable con- 
cepts; 

3. Index research, much of which has concentrated on ter- 
minological relationships, to the neglect of concept-related 
problems; 

4. Indexer predictability (consistency) as a factor in 
indexer-user consistency in choice of concepts or terms for 
the retrieval of indexed information.

CHAPTER I
INTRODUCTION 
The Problem 

Indexing, and an understanding of indexing procedures,
is basic to information flow. This study is concerned
with an elemental aspect of indexing methodology: the iden- 
tification of indexable matter and its expression for pur- 
poses of communication. It is concerned with the definition 
of the term "indexer consistency" and with the use of this
definition in establishing quantitative measurements of 
indexer consistency. 

Previous studies have defined indexer consistency as
the degree of replication in the index terms chosen independently
by two or more indexers, or by the same indexer at
different times, to label the content of a given text as a 
means of providing access to the information in the text. 
These studies will be discussed in detail in Chapter II. 

This study postulates that: 

1. The process of indexing has two parts:

A. Indexer perception of indexable matter (indexable
concepts) in the texts to be indexed; and

B. Indexer characterization of the perceived indexable
matter in words;

2. Indexing is an order-dependent technique in that a concept
must be perceived before it can be expressed in an
index term;

3. Perception of concepts is a process distinct from the
process of choosing terms with which to characterize the
concepts perceived;

4. There may be more than one indexing term that will accu- 
rately characterize a given concept. 

It therefore postulates that indexer consistency 
should be defined as having two parts: 

1. Indexer consistency in the perception of indexable matter; 

2. Indexer consistency in the choice of term with which to 
label the indexable matter perceived. 

The Hypothesis 

The hypothesis to be tested was that the degree of 
indexer consistency in the perception of indexable matter 
can be measured separately from and will be different in 
extent from the degree of indexer consistency in the termi- 
nology chosen to characterize that indexable matter. 

Background of the Problem 
The process by which subject indexers choose the 
index entries or verbal labels that will facilitate the lo- 
cation of information bearing material has been described as 
follows:

It is convenient to think of subject indexing as a
two-step operation:

1. Deciding what a document is about (i.e. its subject
matter);

2. Translating this conceptual analysis into index
terms which act as shorthand symbols, or labels, for the
subject matter of the document.1

Indexing can be regarded as a two-part process.
First, it is necessary to decide what are the essential
ideas of a document that have to be recorded to describe
it. Second, this essence of the document has to be recorded
in a standard way.2

Charles L. Bernier divides his analysis of the 
subject indexing process into four parts: 

Apparently, a subject indexer does four things so 
rapidly and smoothly that even he may be unaware of this 
detail. First, he selects subjects suitable for indexing
according to the policy and rules of the organization 
for which he works. Second, he paraphrases the subject. 
The paraphrase is the verbal embodiment of the subject 
which at the time of selection may not exist in the form 
of words in the mind of the indexer. Third, he provides 
guides to his paraphrases of the subject. These guides 
are statements (embryonic index entries) starting with 
the word or term that seems most closely associated with 
the subject and followed by an expression that makes the 
word or term sufficiently specific to enable the reader 
to decide whether or not he needs to consult the reference
from the entry. Fourth, he translates these guides
into standard index terminology so as to avoid the bane
of all poor indexes--scattering of like information.3

It can be seen that part 1 of Bernier's analysis corresponds
to the first part of Lancaster's and Shaw and Rothman's analyses,
and Bernier's parts 2, 3, and 4 correspond to the second
part of the other analyses quoted above.

These investigators make the same distinction between 
a concept and the term used to characterize, name, or label 



1 F. Wilfred Lancaster, Information Retrieval Systems
(New York: John Wiley and Sons, Inc., 196 b )7pl 3. 

2 T. N. Shaw and H. Rothman, "An Experiment in Indexing
by Word-Choosing," Journal of Documentation XXIV
(September 1968): 159.

3 Charles L. Bernier, "Indexing and Thesauri," Special
Libraries, LIX (February 1968): 99.

the concept as do such semanticists as Korzybski, Ogden,
Richards, Ullman, Hayakawa, and Nida. 

This semantic distinction between a concept and the
term used to label the concept may be thought of as the basis
for the division of the indexing process into two parts.

Objectives of this Study 

The purpose of this study is to demonstrate that even 
though two or more readers of a given text may have identi- 
fied the same concepts in the text, they may express the con- 
cepts in differing terminology; and therefore indexer con- 
sistency studies that use consistency in choice of terminolo- 
gy as their only apparent criterion in determining degree of 
consistency are unconsciously presenting a measure that com- 
mingles the two kinds of consistency. This is not to say that 
the directors of these studies were unaware of the difference 
between a concept and the term used to symbolize it, but that 
they did not consciously distinguish between them in their 
definitions and measurements. The measurements they spoke of 
as being based on degree of match in terminology also included 
indexer consistency in degree of perception of concept, but 
they did not overtly distinguish one from the other. 

This study is designed to show that there is a signi- 
ficant difference in the degree of indexer consistency in 
perception of indexable matter (concepts) and the degree of 
indexer consistency in choice of terminology with which to 
describe that indexable matter; that this difference in degree
will be large enough to be of importance in the investigation,
evaluation, and construction of indexing systems;
that each of these types of indexer consistency should be 
separately identified and included in the determination of an 
overall measurement of indexer consistency; and that this 
ability to investigate the two facets of indexer consistency 
separately may lead to improvement in indexing techniques 
and tools, and increased consistency (predictability) in 
both indexer choice of indexable matter and indexer choice 
of terminology.

Presentation of Study

Chapter II is devoted to an examination of previous 
studies of indexer consistency. They are examined as a group, 
reviews of indexer consistency studies are discussed, and 
certain individual investigations of indexer consistency 
which have particular meaning for this study are reported on 
in detail. 

Chapter III describes the methodology used in this 
study. The procedure used in choosing the textual material 
that was analyzed, the characteristics and training of the 
people employed as indexers, the data analysis procedures, 
and the mathematical formulas and methods used in determining 
the stated indexer consistency scores are explained. 

Chapter IV discusses the concept categorization proc- 
ess. The process is explained, and examples illustrating the 
process and the problems encountered are given. 

Chapter V discusses the findings of the study in terms 
of the results of the statistical methods used. Statistics
for various aspects of the study are displayed and discussed.

Chapter VI presents a summary of the investigation
and conclusions drawn from the findings, and a discussion of
some of the implications of the study.



CHAPTER II 



PREVIOUS STUDIES OF INDEXER CONSISTENCY 

General Discussion of Previous Studies 
of Indexer Consistency 

The library and information science communities car- 
ried out a number of formal studies of indexer consistency 
in the early 1960's. A list of indexer consistency studies 
since 1960 will be found in Appendix A. In this chapter,
these studies will first be considered as a group. A number
of them will then be discussed individually. Special
features or aspects of the studies will be discussed, but the 
primary reason for considering them here is to demonstrate 
that they define indexer consistency as the degree of repli- 
cation or match in the terminology chosen to characterize the 
informational content of the texts. 

Although all of the studies use degree of replication 
or match in terminology as the criterion of degree of indexer 
consistency, some define a match in terminology more liberally 
than others. For example, some of the studies consider the 
singular and the plural forms of a given word as an "exact" 
match, some do not. 
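
The effect of a more liberal or more literal definition of a match can
be illustrated with a short Python sketch. Everything in it is an
assumption made for illustration: the overlap measure (matches divided
by all distinct terms used by either indexer), the deliberately naive
rule that treats a trailing "s" as a plural ending, and the sample
terms. None of it is taken from the studies under discussion.

```python
def naive_singular(term):
    """Lowercase a term and drop a trailing 's' from each longer word.

    A deliberately crude stand-in for treating singular and plural
    forms as the same term; it is not a real stemmer.
    """
    words = term.lower().split()
    return " ".join(w[:-1] if w.endswith("s") and len(w) > 3 else w for w in words)


def match_score(terms_a, terms_b, lenient):
    """Percentage of matching terms out of all distinct terms used."""
    key = naive_singular if lenient else (lambda t: t)
    a = {key(t) for t in terms_a}
    b = {key(t) for t in terms_b}
    union = a | b
    return 100.0 * len(a & b) / len(union) if union else 0.0


# Hypothetical terms assigned to the same text by two indexers.
first = ["abstracts", "index", "patents"]
second = ["abstract", "index", "patent"]

strict = match_score(first, second, lenient=False)
liberal = match_score(first, second, lenient=True)
print(strict, liberal)
```

With these invented terms the same pair of indexers appears nearly
inconsistent under the literal rule and fully consistent under the
liberal one, which is one reason scores from different studies resist
comparison.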

Some of the reports discuss concepts as entities 
separate from the terms used to label them, but in their
analyses and measurement of indexer consistency, they have
all used the degree of match of the terms finally selected 
as the deciding factor in determining degree of consistency. 
Essentially, this procedure presents a combined measure of 
consistency in concept identification and consistency in 
its expression. 

In many cases, primarily those testing indexer con- 
sistency within or between actual working indexing systems, 
lists of terms were supplied to the indexers so that they
could choose terms from the list. 

In these cases, degree of match in terminology was 
also a function of the precision with which the terms on the 
list were defined or understood by the indexer and the 
degree of overlap in the meaning of individual terms. 

This was not always indicated in these studies. Tinker's 
studies, which will be discussed later in this chapter, are 
actually concerned with the measurement of the degree to 
which indexers understand the precise meaning of terms from 
a list, as this understanding is reflected in consistency 
in choice of terminology. 

In the studies in which lists of authorized terms 
were given to the indexers, this kind of vocabulary control 
undoubtedly exerted an influence on the final indexer con- 
sistency scores. The extent of this influence, or even 
the kind of influence exerted by lists of authorized terms, 
is not a variable examined in the study reported on here.

In tests of indexer consistency where no pre-established
lists of terms were provided, the emphasis was
usually placed on such variables as the size of the texts
indexed, the depth of indexing, the conditions under which
the indexing was done, or the type of training or indexing
aids provided the indexers.

In all of these studies, textual material (abstracts,
titles, full articles, patents, sentences) was indexed more
than once and the consistency with which terms defined as 
matching terms were chosen to characterize the informational 
content was computed for each indexing of the text. 

The findings of these studies are not statistically 
comparable and so cannot be used for comparison judgements. 
Such factors as testing conditions, measures of consistency,
experience and education of the indexers, indexing aids,
depth of indexing required, size of universe indexed, type
and size of text indexed, indexing system and terminology,
subject area, and stated objectives of the studies are quite
disparate. There is great disparity in the studies' definitions
of what they consider a "match" in terminology.
In some studies, there is no definition as to what constituted
consistency of terminology. In some studies, it was
defined ambiguously. In some studies, distinctions were
made between consistency in the choice of "significant
terms" and consistency in the choice of "peripheral terms".
(The Zunde and Dexter study which is discussed later in this
chapter is an example of this.) In some cases the
statistical methodology used was not stated.

Reports on Studies of Indexer
Consistency

Two studies of indexer consistency have attempted to
gather and compare other studies.

St. Laurent Review 

The review of the literature of indexer consistency 
done by Mary Cuddy St. Laurent as a Master’s thesis at the 
University of Chicago Graduate Library School in 1966 dis- 
cusses and evaluates reported work up to that time. She 
reaches the conclusion that, "The studies that have been
made of indexer consistency ... do not allow any actual
comparison of the results they contain."1 She blames this
on the over-all design of the studies, the lack of definition
of variables, and the disparity in the measures used
to compute indexer consistency. She does not specifically
discuss the fact that all of the studies define "indexer
consistency" as consistency in final choice of terminology,
but in her introduction, she states that

Consistency refers to the amount of agreement on 
the number of terms considered sufficient to represent 
the significant concepts of a document and to the proportion
of matched terms among indexers.2



1 Mary Cuddy St. Laurent, A Review of the Literature
of Indexer Consistency (Chicago: University of Chicago 
Graduate Library School, 1966), p. 26. 

2 Ibid., p. 7.

The use of the phrase "amount of agreement on the
number of terms" may be thought of as an unconscious attempt 
to discover how many indexable concepts each indexer per- 
ceived. If each term is assumed to label one indexable 
concept, and one indexer uses five terms for a given text, 
while another uses ten terms, it would mean that the first
indexer perceived half the number of indexable concepts that 
the second indexer perceived. None of the studies discuss 
"number of terms assigned" as indexer perception of index- 
able matter, however. 

The use of the phrase "proportion of matched terms" 
indicates that, as in other studies of indexer consistency, 
St. Laurent thought of "indexer consistency" primarily as a 
measurement of the degree of match in terminology. 

Hooper Study 

In his study of indexer consistency studies, R. S. 
Hooper reviewed 17 reports of indexer consistency tests, 
concentrating his attention on their method of measuring
indexer consistency.3 He states:

There is no standard measure of consistency.
Reports which quote indexer consistency values often
do not state how the values were computed. Therefore,
we shall define and express mathematically the
consistency measures which we derive essentially from
the information reviewed in the seventeen reports.

Where raw data was given, in any of the seventeen
reports, consistency values were re-computed in terms
of one of these measures. In other reports, the
author's value is reported and suffixed with a "CX"
to indicate that the exact meaning of the consistency
value cannot be interpreted from information within
the report.4

3 R. S. Hooper, Indexer Consistency Tests - Origin,
Measurements, Results and Utilization (Bethesda, Md.:
IBM Corporation, 1965).

Hooper was actually able to recompute consistency scores for 
only six of the tests by using raw data available in reports 
of the tests with equations he developed for the purpose. 

Hooper does not give a formal verbal definition of 
"indexer consistency" but does give the equations he uses to 
arrive at his measure of it. These equations are based on
terminology:

The consistency of a pair (CP) . . . , that is,
the consistency of one indexer with respect to a second
is based on the number of times the two indexers agree
on the use of a term, divided by the total number of
terms used by either indexer (based on the specific
document).

\[ CP = \frac{100A}{A + M + N} \]

where, A = the number of term agreements between 'M'
           and 'N' for a specific document
       M = the number of terms used by 'M' but not
           used by 'N'
       N = the number of terms used by 'N' but not
           used by 'M'.



The consistency of an individual with respect to a group
(CG), that is, the consistency of any one indexer with
respect to all other indexers (assuming more than two
indexers exist) may be computed by finding the mean of
all pair consistency (CP) values between the one indexer
and all other indexers (who have indexed the same document).

\[ CG_1 = \frac{CP_{12} + CP_{13} + \cdots + CP_{1n}}{n - 1} \]

where, CP_{12} is the consistency (CP) between indexer 1
               and 2
       CP_{13} ... etc.
       n is the number of indexers.5

4 Ibid., p. 3.
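
The two measures quoted from Hooper can be sketched in Python as
follows. The function names, the representation of each indexer's
terms as a set, the treatment of two empty term sets as zero
consistency, and the sample terms are all assumptions made for this
illustration; only the arithmetic follows the quoted definitions.

```python
def pair_consistency(terms_m, terms_n):
    """CP = 100A / (A + M + N) for two indexers' term sets."""
    a = len(terms_m & terms_n)   # A: terms both indexers used
    m = len(terms_m - terms_n)   # M: terms used only by the first indexer
    n = len(terms_n - terms_m)   # N: terms used only by the second indexer
    total = a + m + n
    # Assumption: when neither indexer assigned any term, report 0.
    return 100.0 * a / total if total else 0.0


def group_consistency(indexer, terms_by_indexer):
    """CG: mean CP between one indexer and every other indexer."""
    own = terms_by_indexer[indexer]
    others = [t for name, t in terms_by_indexer.items() if name != indexer]
    if not others:
        raise ValueError("group consistency needs at least two indexers")
    return sum(pair_consistency(own, t) for t in others) / len(others)


# Hypothetical terms assigned to one document by three indexers.
terms = {
    1: {"fish", "trout", "rivers"},
    2: {"fish", "trout"},
    3: {"fish", "mackerel", "oceans"},
}

cp_12 = pair_consistency(terms[1], terms[2])
cp_13 = pair_consistency(terms[1], terms[3])
cg_1 = group_consistency(1, terms)
print(round(cp_12, 1), round(cp_13, 1), round(cg_1, 1))
```

Dividing by the number of other indexers is the same as dividing by
n - 1 in the quoted formula, since n counts every indexer including
the one being scored.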
Hooper states that: 

Inconsistencies may result from a disagreement as 
to the number of index terms which should be used to 
represent a given document, or from a disagreement 
among indexers as to which specific index term should 
be used to represent a specific theme or concept with- 
in a document. 6 

In other words, Hooper sees two variables affecting 
indexer consistency. 

The first is the number of index terms assigned by
each indexer to a given article. This is what Harris,
Rayward, and Svenonius, in a study done under Swanson's
direction, which is discussed later in this chapter, use as
their definition of indexing depth. Hooper does not define
number of indexing terms assigned as "indexing depth". He
does not actually define "indexing depth". However, he
states that he equates depth of indexing with choice of
indexable matter. "The problem of depth of indexing is
simply the problem of deciding which concepts or themes within
a document are worth indexing."7

5 Ibid., pp. 3-4.

6 Ibid., p. 2.

7 Ibid., p. 10.

Depth of indexing and perception of indexable matter
are not synonymous, contrary to what Hooper states. Depth of
indexing, if it is defined as number of index terms assigned, may be a
function of perception of indexable matter, but it is just as
likely to be a function of the rules of the indexing system
within which the indexing is being done. For instance, if
an indexer's instructions are to assign a maximum of five 
index terms to a particular text, the terms he chooses will 
represent different concepts, often concepts of a higher 
generic level, than if his instructions are to assign a 
minimum of eight and a maximum of twenty index terms to the 
same text. In the first instance, he might assign a term 
like "fish"; in the second he might assign a term like 
"fish", but also several terms like "mackerel" and "trout". 

In indexing systems in which a certain number of 
terms are prescribed for each item indexed, an indexer is 
forced to readjust his personal decisions as to appropriate
indexing depth with differences in the length of the texts 
he indexes. If he is asked to use five index terms per 
item, and one item is one page long, while another is twenty 
pages long, the breadth or narrowness of the concepts he 
chooses as indexable may vary since he may be forced to 
choose broader concepts for the long item and narrower con- 
cepts for the short item to arrive at the designated number 
of terms for each item. 

The type of index terms allowable in the information 
system in which the indexer works may affect the number of 
terms he assigns. In a pre-coordinate index system, an
authorized index term might be: "Probationers, psychological
tests." One index term would be used. In a post-coordinate
index system, the same information might require two index
terms: "Probationers" and "Psychological tests".
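
The effect of the two conventions on the number of terms counted can
be illustrated with a small Python sketch. The splitting rule used
here (one post-coordinate term per comma-separated part, with the
first letter capitalized) is a hypothetical simplification chosen to
reproduce the example above; actual indexing systems define their own
term structures.

```python
def to_post_coordinate(entry):
    """Split a pre-coordinated entry into separate terms (assumed rule)."""
    parts = [p.strip() for p in entry.split(",")]
    # Capitalize the first letter of each non-empty part, leave the rest as is.
    return [p[0].upper() + p[1:] for p in parts if p]


# The same information recorded under each convention.
pre_coordinate_terms = ["Probationers, psychological tests"]
post_coordinate_terms = to_post_coordinate(pre_coordinate_terms[0])

print(len(pre_coordinate_terms), len(post_coordinate_terms))
```

A count of "number of terms assigned" would therefore differ between
the two systems even though the indexer perceived the same indexable
matter in both.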

Indexer's instructions are, of course, not limited 
only to the number of terms he should assign to a given item. 
They also may instruct him to index only the main topic(s)
of the item when taken as a whole (H. W. Wilson), or to 
index only new material (Chemical Abstracts). These kinds 
of instructions, and others not mentioned here, may affect 
the kinds of concepts an indexer perceives as indexable as 
well as the breadth, specificity, or number of the concepts 
he perceives as indexable. 

The second variable that Hooper says affects indexer 
consistency is a disagreement among the indexers as to which 
specific index term should be used to represent a specific 
theme or concept within a document. 

This applies directly to the problem investigated in 
the present study. How great an effect does indexers' dis- 
agreement as to which index term should be assigned to a 
particular concept have on measurements of over-all indexer 
consistency? 

Hooper also states that his review of indexer consistency
studies showed that, "There was a large disagreement
among indexers as to what information within a document
should be indexed."8

8 Ibid.

This statement also applies directly to the problem
investigated in the present study. Granted that there is
disagreement among the indexers "as to what information within
a document should be indexed", how large is the degree of
disagreement and is it significantly less than their degree
of disagreement in choice of terminology?

Despite his statements about indexer perception of 
indexable matter and indexer disagreement in choice of term 
with which to describe a given perceived concept, Hooper
used agreement in use of terminology as his only stated 
measure of indexer consistency. 9 

For the indexer consistency studies Hooper reviewed 
in which the degree of inter- or intra- indexer consistency 
was expressed as a percentage, the indexer consistency scores 
were as follows. 

The scores for studies a, d, e, f, g, and o represent 
scores Hooper derived using his own formulas on raw material 
found in the reports of the studies. In each case, the 
score Hooper got from his recomputation was the same as or 
lower than the score originally reported by the director of 
the study. The scores for studies b, c, i, k, l, m, n, and
q represent scores given in the original reports of the studies.

The range of the scores seems to indicate either:

1. That there is an enormous range in indexer consistency, or

2. That there is a lack of agreement on what the variable
"indexer consistency" actually consists of and that this
affects the scores.

9 Ibid., pp. 3-5.



TABLE II-1

INDEXER CONSISTENCY SCORES RECORDED IN
HOOPER STUDY 10

Indexer Consistency
Score                  Hooper's Designation

10%                    Study b (Jacoby)
18%                    Study i (MacMillan and Welt)
24%*                   Study a (Rodgers)
35-45%                 Study c (Slamecka and Jacoby)
36-59%                 Study m (Korotkin and Oliver)
40%*                   Study e (Painter AEC)
42%*                   Study g (Painter OTS)
46%                    Study n (DDC)
48%*                   Study d (Painter ASTIA)
59%                    Study l (Rodgers)
70%                    Study k (Kyle)
70%*                   Study f (Painter NAL)
73%                    Study q (Bryant, King and Terragno)
80%*                   Study o (Hooper)

*Studies for which Hooper recomputed the scores using
his own formulas on the raw data found in the studies.



These studies were of interest in the situations in
which they were done. They presented information of value 
to the investigators who conducted them. But they present 
an uneven base from which it is difficult if not impossible 
to draw any generalizations on indexer consistency except 
that, as previously studied, indexer consistency presents an 
inconsistent character. 



10 Ibid., pp. 12-19.



Individual Tests of Indexer Consistency



Rodgers Study 

One of the earliest of inter-indexer consistency
studies is that by Dorothy J. Rodgers, completed in 1961.
She selected twenty articles concerned with the organization
of information for storage and retrospective search. One
of the reasons these articles were selected was

. . . that H. P. Luhn had published his computer-
generated 'auto-abstracts' and keywords selected on
the basis of frequency from this set of documents.
This made it possible to compare the words selected
by ISO technicians with those selected by Luhn's statistical
system.11

(ISO technicians are technicians who work in the Information
Systems Operation, a part of the General Electric Company.)

Eight individuals indexed these twenty articles by
selecting "those key words from the documents that he might
later use in retrieval."12 These were literally single
words, acronyms, or in one case, a personal name.

11 Dorothy J. Rodgers, A Study of Inter-Indexer Consistency
(Washington, D.C.: General Electric Company, 1961),
p. 8.

12 Ibid., p. 10.

The words selected by each of the eight were then
compared and various analyses were conducted based on the
degree of replication in the keywords chosen by each of the
analysts; the number of keywords chosen by each; the length
of the article in relation to the number of keywords chosen;
the physical position of these words in the document (whether
they appeared in the title, sub-title, abstract, or the body
of the text); and the proportion of words selected both by
Luhn's frequency count procedure and by the human indexers
out of the total universe of keywords selected by both
methods.

Rodgers states that "Consistency is here defined as
the number of topics which two or more indexers independently
select as an important topic from an article."13 The word
"topic" is not defined, but it is apparent that "key word"
and "topic" are viewed as interchangeable by Rodgers since
all the analyses are based on similarity or dissimilarity of
key words. She also states in her summary that "The key
words selected were analyzed to determine the degree of
agreement among indexers in terms of choice of key words."14

It appears that degree of agreement in choice of individual 
text words was the criterion for the establishment of degree 
of indexer consistency. This was, of course, not really a 
test of indexer consistency in a precise sense, since the 
objective was to choose keywords that the person might later 
use for retrieval, not terms for index access (terms which 
may have been composed of more than one word). 

The mean inter-indexer consistency score for the
eight indexers and the twenty articles in the study was 24%.
Consistency scores for each article ranged from 16% to

13 Ibid., p. 6.

14 Ibid., p. 21.

The mean consistency score for Luhn's method in
relation to the human indexers was 15%.16

Agreement in choice of terminology was the criterion 
for the establishment of degree of indexer consistency. 

Painter Study

One of the better known of the indexer consistency
studies listed in Appendix A is that done by Painter as a
part of her doctoral dissertation.17 For the purposes of
her study, various government agencies re-indexed reports
they had indexed previously. The Office of Technical
Services re-indexed thirty-two items; the Armed Services
Technical Information Agency, ninety-four; the Atomic
Energy Commission, ninety-six items; and the National
Agricultural Library re-indexed ninety-nine items. There
was no attempt to have the indexer who had originally indexed
the item re-index it.

Indexer consistency was defined as a match in termi- 
nology. Singular and plural forms or adjectival and noun 
forms of the same word were considered as matching. The 

15 Ibid., p. 54.

16 Ibid., p. 59.

17 Ann F. Painter, Analysis of Duplication and Consistency
of Subject Indexing Involved in Report Handling at the
Office of Technical Services, U.S. Department of Commerce
(Washington, D.C.: U.S. Office of Technical Services,
1963).

highest consistency recorded was 72% at the National
Agricultural Library; the lowest was 44% at the Atomic Energy
Commission.

This wide variation in indexer consistency scores
occurred despite the fact that Painter used the same techniques
and definition of indexer consistency throughout her study.

Both the National Agricultural Library and the
Atomic Energy Commission used lists of authorized terms
as indexer aids. The Atomic Energy Commission used a
traditional subject heading system. The National
Agricultural Library used the subject headings established in
the subject index to the previous year's Bibliography of
Agriculture.

Painter states that

The duplicate indexing investigations tabulated
and studied . . . were attempts . . . to determine the
degree of equivalency in the terminologies. Essentially
the comparisons were made of matches, which were
similar in appearance rather than concept (synonymous),
but where different words were used for the same concept
there was some attempt to record the fact. For
the most part, it includes only the straight word-for-word
match allowing for grammatical differences.18

Painter was aware that more than one term could be
used to label a particular concept, but chose to base her
judgements of indexer consistency in this study primarily
on terminology. This is, of course, in keeping with other
studies of indexer consistency. Allowing for grammatical
differences, here as elsewhere, may be a partial recognition
that consistency in concept identification does not
necessarily result in consistency in terminological
expression. Here, as elsewhere, however, the two are commingled
in the final results.

Saracevic and Goldwyn Study

In this study by Saracevic and Goldwyn,19 fifty
abstracts were indexed using keywords as the indexing
language. Indexers were divided into four groups of
experienced indexers (these groups were based on the type of
indexing language the indexers had used previously) and a
fifth group of inexperienced indexers.

18 Ibid., p. 100.

19 Tefko Saracevic and A. J. Goldwyn, An Inquiry
into Testing of Information Retrieval Systems, Part I:
Objectives, Methodology, Design and Controls (Cleveland,
Ohio: Case Western Reserve University Center for
Documentation and Communication Research, 1968).

The inter-indexer consistency for one indexer
with all other indexers in the group was calculated
by taking the mean Indexing Consistency measures of
that particular indexer with every other indexer in
the group . . . .20

thus setting up pairs of indexers in which each indexer
was paired with every other indexer in his group. A
simple formula was used to arrive at a measure of
consistency for each pair of indexers:

Indexing Consistency = (Number of Terms in Agreement) / (Total Number of Unique Terms)

A match in terminology (keyword) was the only criterion for
indexer consistency. No indication is given in the paper
to show whether "keyword" in this case meant individual
words, or included multi-word terms.

Average inter-indexer consistency ranged between
34.9% and 63.5%.21 There was no attempt to arrive at a
measure of consistency in identification of indexable
concepts.

The formula used by Saracevic and Goldwyn is both
simple and effective. The formulas used to measure
consistency of both concept and terminology in the investigation
described in this report are based directly on it.
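
The pairwise formula quoted above, and the per-indexer mean built from it, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure of Saracevic and Goldwyn: it assumes that each indexer's output for a document is a set of term strings, that "terms in agreement" means the terms common to both indexers, and that "total number of unique terms" means the distinct terms used by either indexer. The function names and the sample terms are hypothetical.

```python
from itertools import combinations


def pair_consistency(terms_a, terms_b):
    """Terms in agreement divided by total unique terms, for one pair."""
    a, b = set(terms_a), set(terms_b)
    union = a | b
    if not union:
        return 0.0  # assumption: two empty term sets score zero
    return len(a & b) / len(union)


def mean_consistency_per_indexer(indexing):
    """Mean pairwise score of each indexer with every other indexer.

    `indexing` maps an indexer name to that indexer's set of terms
    for a single document.
    """
    scores = {name: [] for name in indexing}
    for x, y in combinations(indexing, 2):
        c = pair_consistency(indexing[x], indexing[y])
        scores[x].append(c)
        scores[y].append(c)
    return {name: sum(v) / len(v) for name, v in scores.items() if v}


# Hypothetical sample data for one abstract indexed by three people.
sample = {
    "A": {"indexing", "consistency", "keywords"},
    "B": {"indexing", "consistency", "retrieval"},
    "C": {"indexing", "abstracts"},
}
means = mean_consistency_per_indexer(sample)
print(means)
```

Because the score depends only on matching term strings, two indexers who perceive the same concept but label it differently score exactly as if they had perceived different concepts, which is the limitation this chapter keeps returning to.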

Jacoby and Slamecka Study

Jacoby and Slamecka contrasted the indexing of
experienced and inexperienced indexers.22 They also

20 Ibid., p. 117.

21 Ibid., p. 119.

22 J. Jacoby and V. Slamecka, Indexer Consistency
Under Minimal Conditions (Bethesda, Md.: Documentation, Inc.,
1962).



measured indexer consistency by degree of match in
terminology, "the consistency with which indexers tend to choose
the same terms as being descriptive of the same documents."23
They first measured this "under artificial conditions
which excluded the use of indexing tools, communication,
and post-indexing editing . . . ."24 Later, they measured
the intra-indexer consistency of the indexers when
"re-indexing 'equated' documents and using a vocabulary of
'general' (shared) terms."25 Consistency rates for these
studies ranged from 41% to



Tinker Studies 



Tinker has reported two studies relating to indexer
consistency.26, 27 His primary focus was on precision of
meaning in terminology and he equates the consistency with
which indexers applied certain terms to a given document to
the precision of the indexers' understanding of the meaning
of the terms.

Through measuring the consistency with which a
term is applied to a concept, we are able to assess
whether or not its meaning is understood with
precision. By having a number of abstracts indexed by a



23 Ibid., p. IV.

24 Ibid.

25 Ibid.

26 John F. Tinker, "Imprecision in Meaning Measured
by Inconsistency of Indexing," American Documentation XVII
(April 1966): 96-102.

27 John F. Tinker, "Imprecision in Indexing, Part II,"
American Documentation XIX (July 1968): 322-30.

number of people, it is possible to discover the
consistency with which a given indexing term was used and
hence, how well the meaning of the term was
understood.28

He uses the degree of indexer consistency in use of termi- 
nology as a means of measuring indexers’ degree of under- 
standing of the precise meaning of the terminology. 

In the first study reported, fifteen indexers were
asked to choose descriptors for fifty abstracts. They
were not given a list of terms or any instructions for
making a choice of terms. This resulted in a list of
1,050 different words or phrases. When a selected list
of one hundred of these words or phrases was given to the
same indexers and they applied these to the same fifty
abstracts, Tinker states that: "The consistency of
application increased markedly, and 6 of the terms were used
with perfect precision."29

Tinker

. . . proposes that meaning can be defined as the
relevance of a word to the concept that it labels
. . . . By assigning a descriptor [Tinker defines
'descriptor' as a synonym for 'index terms'] to a
document, the indexer asserts that the descriptor
has a high degree of relevance to the content of the
document; that is, he asserts that the meaning of
the descriptor is strongly associated with a concept
embodied in the document, and that it is appropriate
for the subject area of the document. Let us assume
that the indexers assign the descriptors in the order
of the degree of relevance to the concepts, or that
they assign all of the descriptors that they believe
have a high degree of relevance. Then the consistency
with which a given degree of relevance is associated
with a given descriptor-concept pair will reflect the
precision of the association strengths. Hence,
consistency of indexing serves as a measure of the
precision of meaning.30

28 Tinker, op. cit. (1966), p. 97.

29 Ibid., p. 101.



Tinker assumes that only one of his 100 indexing
terms will be a "precise" surrogate or label for a particular
concept in the abstracts indexed. He treats the
assignment of this term to the abstract as an indication
that the indexer perceived an exact one-to-one relationship
between the concept and the term. He assumes that in a
given field of knowledge there may be degrees of relevance
of terms to concepts, but that in his list of indexing terms
there is one term which will have a 100% association factor
with a given concept in the abstracts indexed.

This is why he equates the consistency with which a
group of indexers assign a term to an abstract with the
degree to which the indexers understand the meaning of the
term precisely. If all indexers apply or fail to apply the
term, there is 100% precision of meaning in their
understanding of the term. If they are divided in their
application or non-application of the term, there is not
100% precision in their understanding of the meaning of the
term.

Tinker states that the findings of his 1966 study
indicate "that a drastic reduction in the number of allowed
indexing terms would increase the precision with which the
terms would be used."31 It seems obvious, of course, that
if the number of possible choices in terminology is
reduced from near infinity to 100, or even from 1,000 to 100,
the statistical odds on choosing the same terms would
increase significantly even if all other factors were equal.

30 Ibid., p. 97.

In the 1968 study, Tinker begins by discussing the
findings of his 1966 study, but states that:

. . . a limited and inflexible set of indexing terms
has serious disadvantages . . . a small set of indexing
terms is limited in the richness of description
it is capable of. Clearly, limiting the choice of
indexing terms to a small set is unsatisfactory.32

Tinker therefore established a small set of indexing terms
for the use of the indexers in the study, but allowed them
to add modifiers to the terms.

. . . the indexer was required to choose broad terms
from a short list, then freely assign modifiers to the
terms, so that the combination of terms and modifiers
described the document and distinguished it from the
others in the file.33

In the study reported in 1968, Tinker assigned
thirteen abstracts of articles in the field of photographic
science to nineteen indexers.

The indexers were given an authority list of only
34 terms, which together form a classification of
photographic science. They were asked to choose
descriptors from this list and freely add modifiers.34

31 Tinker, op. cit. (1968), p. 322.

32 Ibid.

33 Ibid.

34 Ibid., p. 326.



Tinker's objective was to learn whether an authority list
to which indexers might freely add modifiers would increase
or decrease precision of meaning as indicated by the
consistency with which the indexers assigned a given term to
a given text.

Tinker states:

If all the indexers have the same understanding
of the meaning of a term, they will unanimously apply
it, or fail to apply it, to each abstract. The
extent to which they deviate from this unanimity is
shown on a graph showing the fraction of indexers
applying the descriptor as the ordinate. The
abscissa of the graph is the rank of an abstract, so
that the curve rises to the right. We can define
perfect understanding and perfect precision of
meaning as yielding a rectangular curve -- one with points
only at 0 and 100%.

Tinker gives, as an example, a graph derived for the
descriptor: emulsion technology.

It is a term that would be expected to have high
precision among these indexers, since it describes a
subject area in which they are competent. The graph
shows that the term is not used with perfect precision,
since it is not a rectangular curve. Furthermore, the
imprecision is about the same as is observed when terms
are chosen freely . . . . [As in the 1966 study]

The use of an authority list, in the way we have
explained, does not increase the inherent imprecision
of words.35

One would also have to add that it did not appear to decrease
it.
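
The curve Tinker describes can be sketched as follows. This is a hedged illustration rather than Tinker's own computation: it assumes that the data are available as, for each abstract, one set of assigned descriptors per indexer, and that "rank of an abstract" simply means ordering the abstracts by the fraction of indexers who applied the descriptor. The function names and the sample data are hypothetical.

```python
def application_curve(assignments, descriptor):
    """Fraction of indexers applying `descriptor` to each abstract.

    `assignments` maps an abstract id to a list holding one set of
    descriptors per indexer. The fractions are returned in ascending
    order, so the curve rises to the right.
    """
    fractions = []
    for indexer_sets in assignments.values():
        applied = sum(1 for terms in indexer_sets if descriptor in terms)
        fractions.append(applied / len(indexer_sets))
    return sorted(fractions)


def is_rectangular(curve):
    """True when every point lies at 0 or 100%, i.e. unanimous use or non-use."""
    return all(f in (0.0, 1.0) for f in curve)


# Hypothetical data: four indexers, three abstracts.
term = "emulsion technology"
sample = {
    "abstract 1": [{term}, {term}, {term}, {term}],
    "abstract 2": [set(), set(), set(), set()],
    "abstract 3": [{term}, set(), {term}, set()],
}
curve = application_curve(sample, term)
print(curve, is_rectangular(curve))
```

In this made-up example the indexers split evenly on the third abstract, so the curve has a point between 0 and 100% and is not rectangular.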



Tinker was not studying indexer consistency in these
investigations. However, he used degree of indexer
consistency as his criterion for the measurement of degree of
precision in meaning.

35 Ibid., pp. 329-330.






Tinker's studies assume that consistency of
indexing is dependent on replication of terminology. He
states: "If all the indexers have the same understanding
of the meaning of a term, they will unanimously apply it,
or fail to apply it, to each abstract."36 He fails to
state that they may not apply it if they do not see it as 
expressing the indexable matter in the text. He is also 
assuming that indexers will perceive the same content 
although they may express it differently, and that only one 
term in an authority list is appropriate for one concept. 
This is not necessarily so. It is possible that not only 
will indexers use different words for the same concept and 
use the same word for differing concepts, but, based on the 
data of this study, they may also disagree on which con- 
cepts in a given text are indexable. Perhaps Tinker’s use 
of abstracts rather than full texts as the documents to be 
indexed has some bearing on this matter. Although Tinker's 
studies are among the most interesting of the studies of
indexer consistency, once again, indexer consistency is 
measured only in terms of replication of terminology. 

36 Ibid., p. 329.

Zunde and Dexter Studies

Zunde and Dexter have also reported two studies of
indexer consistency. The first, reported in 1969, was
concerned with developing a measure of indexer consistency
which would "assign a higher consistency value if indexers
agree on the more important terms than if they agree on less
important terms."37 The degree of importance of a particular
term in relation to the content of a particular text was
defined as equal to

. . . the degree of consensus of indexers in selecting a
term . . . . In other words, the more indexers select a
given indexing term, the more representative it should be
considered with respect to the contents of the document.38

Zunde and Dexter conclude:

Measures of indexing consistency should reflect not
only the formal agreement of indexers on a number of terms,
but also the significance of terms on which the indexers
agree or disagree.39

Zunde and Dexter thus opened a meaningful area for in- 
vestigation. Indexer consistency in choice of highly signif- 
icant terms is certainly more important than indexer consist- 
ency in choice of less significant terms. The problem lies 
in the definition of "significant". If a "significant" term 
is defined as one which has been chosen by two or more index- 
ers, can indexer consistency in choice of "significant term" 
be defined as the degree of duplication in the terms chosen 
by two or more indexers? This would seem to be circular rea- 
soning, defining each variable in terms of the other. 

Zunde and Dexter used two equations to measure
indexer consistency in this study. The equation which

. . . reflects the agreement of a group of indexers
on the significance of the selected terms, produced
on the average higher consistency values than the
measure given by . . . [the second equation] . . .
which does not reflect any judgement of significance
of the terms.40

37 Pranas Zunde and Margaret E. Dexter, "Indexing
Consistency and Quality," American Documentation XX (July 1969):
259.

38 Ibid., p. 262.

39 Ibid., p. 266.

Twenty-nine biomedical documents were indexed by eight
professional indexers and eight scientists; and nine student
indexers indexed sixteen documents. In the first instance,
a list of terms was supplied to which the indexers could
freely add terms. In the second instance, no list of
terms was supplied to the indexers. It is not clear from
the report what effect, if any, this had on consistency
scores since it is not considered separately from other
variables in the study. Consistency scores ranged from
less than 10% to 39%.41

The second study reported by Zunde and Dexter42
investigates the relationship between the readability of
a document and consistency or quality of indexing as
measured by the equations developed in their first study on
the data used in their first study. (The measure of
readability used is the one proposed by Rudolph Flesch in
1948.43)

40 Ibid., p. 263.

41 Ibid.

42 Pranas Zunde and Margaret E. Dexter, "Factors
Affecting Indexing Performance," Proceedings of the American
Society for Information Science VI (1969): 313-322.

43 Rudolph Flesch, "A New Readability Yardstick,"
Journal of Applied Psychology XXXII (1948): 221-233.

The above study also investigated the effect of the
temperature of the work area on the indexing performance of
a group of graduate students indexing Reader's Digest
articles.

Neither the readability of the documents nor the
room temperature was shown to influence indexer consistency
to a significant extent.

Both Zunde and Dexter studies define consistency of
indexing as

. . . the degree of agreement within a group of
indexers in the representation of essential
information content of the document by certain sets of
indexing terms selected individually and
independently by each of the indexers in the group.44

Once again, replication of terminology is the criterion for
the definition of indexer consistency.



Cooper Study 

Cooper's study45 differs from the ones cited
previously because it is not based on actual indexing. Rather,
it is a closely reasoned discussion based on various mathe- 
matical models and equations. However, in common with all 
the other investigators previously cited, Cooper used con- 
sistency in choice of index terms as the basis for his 
definition of indexer consistency. 

44 Zunde and Dexter, op. cit., p. 313.

45 William S. Cooper, "Is Interindexer Consistency
a Hobgoblin?" American Documentation XX (July 1969): 268-278.

For any allowable index term, there will be a
certain proportion (possibly none) of the indexers who
have assigned the term to the document, and a
remaining proportion who have not. We define the
interindexer consistency with respect to the given term
and document to be the larger of these proportions
minus the smaller. . . . For example, if 90% of the
indexers assign the term to the document, the
consistency is C = 90% - 10% = 80% for that term. Also,
if 90% of the indexers do not assign the term to the
document, the consistency will again be 80%, for it is
only the amount of agreement which is of interest, not
the nature of the agreement. The definition assigns
a consistency rating of 100% (the maximum possible)
in case all the indexers are unanimous in assigning
the term to the document and likewise 100% in case
they are unanimous in not assigning the term.46
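
Cooper's per-term definition lends itself to a very small worked sketch. The code below is an illustration only; the function name and its arguments are hypothetical. It uses the fact that "the larger proportion minus the smaller" for k assigning indexers out of n is |k - (n - k)| / n, that is |2k - n| / n.

```python
def cooper_term_consistency(n_assigning, n_indexers):
    """Larger proportion minus smaller proportion for one term and document.

    n_assigning: number of indexers who assigned the term.
    n_indexers:  total number of indexers (must be positive).
    """
    if n_indexers <= 0:
        raise ValueError("n_indexers must be positive")
    if not 0 <= n_assigning <= n_indexers:
        raise ValueError("n_assigning must lie between 0 and n_indexers")
    # |k - (n - k)| / n, the difference between the two proportions
    return abs(2 * n_assigning - n_indexers) / n_indexers


# Cooper's own example: 90% assign the term, 10% do not.
print(cooper_term_consistency(9, 10))
# The mirror case: only 10% assign the term.
print(cooper_term_consistency(1, 10))
```

Both calls give the same value, since only the amount of agreement counts and not its direction; an even split gives zero and unanimity in either direction gives the maximum.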

Cooper continues his discussion and explores various other 
aspects of the problem of indexer consistency, but in accor- 
dance with other investigators, he defines indexer consis- 
tency as consistency in terminology, which represents both 
choice of concept and means of expression. Concept choice 
is, however, implicitly considered as Cooper introduces the 
idea of non-use of a term as part of consistency. He never 
expresses this, however, in terms of the two distinct opera- 
tions in the indexing process. 

Cooper's statement that

. . . the phenomenon of interindexer consistency is
devoid of practical interest unless it can be shown
that it has something to do with indexing quality and
ultimately with retrieval effectiveness . . .47

should certainly also be mentioned here. He is right in
contending that studies of indexer consistency are of little



46 Ibid., p. 271.

47 Ibid., p. 268.

interest unless indexer consistency can be related to
retrieval effectiveness.

He states that if interindexer consistency is
improved at the expense of indexer-requester consistency,
information retrieval effectiveness will be impaired.
That is, if indexers in a given information retrieval system
become consistent in their assignment of index terms, but
these terms differ from the terms used by the system's
patrons in their requests for information, then the goals
of the information retrieval system and the effectiveness
of information retrieval will be impaired. He hypothesized
that:

If method B produces a higher level of interindexer
consistency than A, and at the same time the
indexer-requester consistency attained under B is as high as
that attained under A, then the use of B results in
greater retrieval effectiveness than the use of A.48

His conclusions are that although at present, not enough is
known about indexer consistency for it to be used as a gauge
of indexing quality, it "has a definite and mathematically
analyzable relationship with retrieval success."49

48 Ibid., pp. 270-1.

49 Ibid., p. 277.

It is possible that a situation might occur in which
an indexing term is assigned to a given article by an indexer
even though it is not an accurate label (in a dictionary
sense) for the particular subject concept it is meant to
characterize. However, if there is a good syndetic
apparatus, or if the requesters are aware that this particular
index term is assigned consistently to identify this
particular concept, the requesters will use it when they want to
retrieve information on that subject. An analogous example
of this has been described by Herner as follows:

The Library of Congress, for years, classified
computers under Calculating Machines, completely
ignoring non-numerical applications; however, you
could always depend on books on computers being
shelved with books on calculating machines in
libraries using the LC classification and this made it a
system. It was dependable -- or perhaps
consistently undependable would be better.50

In this case, and probably in many others, it was more im- 
portant for the label assigned to the subject to be assigned 
consistently than it was for it to be assigned accurately. 

In other words, it may be extrapolated that indexer-requester 
consistency may be enhanced when indexers are consistent in 
their assignment of terms to subject concepts if the re- 
questers are aware of the way in which the term is assigned, 
whether or not the term is assigned accurately in a dic- 
tionary sense. In addition, the development of consistency 
in the sense of predictability is essential for scientific 
analysis of indexing and the development of the art. It may 
be assumed that the goals are both quality and predictability, 
since if no attention is paid to quality (i.e., value in 
locating information for real information seekers) achieve- 
ment of complete predictability is a trivial goal. 

50 Saul Herner, "System Design, Evaluation, and
Costing," Special Libraries LVIII (October 1967): 577.

Harris, Rayward, and Svenonius Study

Harris, Rayward, and Svenonius tested inter-indexer
consistency at various indexing depths.51 In their study,
nine people each indexed three articles.

each person indexed each article with 50 terms,
a term being a phrase of not more than 3 words. To
see if depth of indexing was related to consistency
each list of 50 terms was ordered by 6 depth levels:
depth I consisted of those 5 terms which would have
been used to index the article if only 5 terms were
allowed; depth II consisted of 10 terms; depth III,
20; depth IV, 30; depth V, 40; depth VI, 50. (It
was somewhat questionably assumed that given 10 terms
to index an article, these 10 must include the 5 terms
which would be chosen if only 5 terms were allowed.)52

Two of the three articles were two pages long; one
article was five pages long.

Using fifty terms to describe the content of an
article two pages long is an unusual indexing practice, but
aside from this, the study is of interest because the
investigators deliberately varied their definition of "consistency"
to ascertain the effect this would have on their measure of
percentage of indexer consistency.

They first define inter-indexer consistency as the
"number of like terms selected by different people when
indexing an article . . . percentage of exact ('machine-like')
matches . . . ."53 Then they change this definition to
include successively



51 D. Harris, W. B. Rayward, and E. Svenonius, The
Testing of Inter-Indexing Consistency at Various Indexing
Depths (Chicago: University of Chicago Graduate Library

1. Trivial variations in terms such as singular and plural
forms of the same word,

2. Synonyms,

3. Hierarchically related terms.

Findings were that

. . . variant match consistency showed on the average
67% improvement over exact match consistency. There
was very little improvement using synonyms.
Consistency based on matching hierarchically related terms
was on the average twice as high as variant-match
consistency and three times higher than exact match
consistency . . .54

The following table, from an unnumbered page
preceding page 7, gives the percentages of consistency they found.



TABLE II-2

PERCENTAGE CONSISTENCY AND DEPTH OF INDEXING
AS RECORDED IN HARRIS, RAYWARD, AND
SVENONIUS STUDY

Depth Level        Exact   + Variant   + Synonyms   + Hierarchy

I   (5 terms)       13        24          26           45
II  (10 terms)      18        22          23           48
III (20 terms)      12        18          18           42
IV  (30 terms)      13        20          21           43
V   (40 terms)      16        19          23           48

(Percentages for depth Level VI were not given)

54 Ibid., p. 6.

Varying the definition of "match" to include
synonyms and hierarchically related terms may have been a way
of attempting to include concept consistency in the
measurements. This was not stated, however. It is unfortunate
that this study represents work done on a sample of only
three short articles.
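
The effect of successively broadening the definition of "match" can be illustrated with a short sketch. It is not the procedure of the study discussed above: the normalization rule for trivial variants, the synonym table, the one-step parent table, and the sample terms are all hypothetical stand-ins chosen only to show how each broader definition admits more matches.

```python
LEVELS = ("exact", "variant", "synonym", "hierarchy")


def normalize_variant(term):
    """Crude stand-in for 'trivial variations': lowercase, drop a final 's'."""
    t = term.lower().strip()
    return t[:-1] if t.endswith("s") and len(t) > 3 else t


def matches(a, b, level, synonyms, parents):
    """Cumulative match test: each level also accepts all narrower levels."""
    if a == b:
        return True
    if level == "exact":
        return False
    na, nb = normalize_variant(a), normalize_variant(b)
    if na == nb:
        return True
    if level == "variant":
        return False
    # Map each term to a preferred synonym, if the table has one.
    sa, sb = synonyms.get(na, na), synonyms.get(nb, nb)
    if sa == sb:
        return True
    if level == "synonym":
        return False
    # Hierarchy: one term is the immediate parent of the other.
    return parents.get(sa) == sb or parents.get(sb) == sa


def count_matches(terms_a, terms_b, level, synonyms, parents):
    """Number of terms in terms_a matched by some term in terms_b."""
    return sum(
        any(matches(a, b, level, synonyms, parents) for b in terms_b)
        for a in terms_a
    )


# Hypothetical terms from two indexers and hypothetical lookup tables.
indexer_a = ["indexing", "abstracts", "automobile", "poodle"]
indexer_b = ["indexing", "abstract", "car", "dog"]
synonyms = {"automobile": "car"}
parents = {"poodle": "dog"}

counts = [count_matches(indexer_a, indexer_b, lv, synonyms, parents) for lv in LEVELS]
print(dict(zip(LEVELS, counts)))
```

With these made-up terms the number of matches grows by one at each level, which mirrors in miniature the way reported consistency percentages rise as the definition of a match is loosened.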

The Harris, Rayward, and Svenonius study illustrates
what is hinted at in many of the other studies.

1. As the definition of "indexer consistency" is varied
from an exact word-for-word match in terminology to include
matches that are more broadly defined, the resultant
percentages of consistency rise. This is in keeping with the
findings of this study, and this broadening of the
definition of a "match in terminology," here as in some of the
other studies, may be thought of as an indirect attempt to
solve the problem directly attacked in this study.

2. Although previous studies of indexer consistency state 
they are measuring consistency in terminology, the effect 
of the varying definitions of indexer consistency used in 
the studies results in scores that are composed of mixtures 
of the two kinds of indexer consistency identified in this 
study, scores in which the two kinds of consistency are 
present in differing and uncontrolled degrees. 

The investigators of these previous studies were
unwilling to accept a word-for-word match in terminology as
a definition of indexer consistency. However, they did not
consciously use the distinction between the two parts of the
indexing process as the basis for a new definition. The
result, as stated previously, is that the definitions they
used and the scores they reported represent an
undifferentiated mix of the two kinds of indexer consistency that are
consciously considered separately, defined separately, and
measured separately in this study.

Investigations of Indexing Methodology 
in Which Concept Categories Based 
on Synonymy Were Established 

The Harris, Rayward, and Svenonius study is the only
previous study of indexer consistency that considered and
measured degree of synonymy of terms as a clearly defined
variable. Although other reported indexer consistency
studies have not investigated indexer consistency in
perception of concepts except as an undifferentiated part of a
general measure of indexer consistency, studies of other
areas of indexing methodology have intentionally used
concept-based, rather than word-based categories. Two of these
studies are discussed at the end of Chapter IV in the detailed
discussion of the concept categorization process used in this
study.

These studies55, 56 investigated the degree to which
the words in the title of an article might be said to
reproduce the subject headings that had been assigned to the
article by a human indexer. The objective was to
investigate the feasibility of a KWIC or KWOC index for the titles
of the articles.

55 Christine Montgomery and Don R. Swanson, "Machine-Like
Indexing by People," American Documentation XIII (October
1962): 359-366.

56 Donald H. Kraft, "A Comparison of Keyword-in-Context
(KWIC) Indexing of Titles With a Subject Heading
Classification System," American Documentation XV (January 1964): 48-52.

In these studies, if a word or phrase in the title 
matched a word or phrase in the subject heading, or if they 
had the same semantic root, they were considered a "match". 
This is similar to the kinds of "matches" used in previous
studies of indexer consistency. In addition, however, the
investigators included in their definition of "match" words 
that belonged in the same hierarchal group, and the investi- 
gators also established certain words or phrases in the titles 
as being synonymous or "logically equivalent" to the subject 
headings that had been assigned to the article. That is, 
these words were said to characterize concepts synonymous to 
the concepts characterized by the words in the subject head- 
ing. The words or phrases that had been included by the 
investigators in these synonymy-based categories were con- 
sidered a "match" with the subject headings for which the 
"synonymous" or "logically equivalent" relationship had been 
established. 

In some ways, the categories established in these 
studies are similar to the concept categories established 
for this study. This is discussed more fully, as stated 
above, at the end of Chapter IV.

Major Differences Between Previous Indexer
Consistency Studies and the Study
Reported in This Dissertation

The indexer consistency studies listed in Appendix A 
and the indexer consistency studies discussed in this chap- 
ter have defined indexer consistency (when it was defined) 
as the consistency of various degrees of replication of 
terminology. Only the Harris et al. study departed from
this.

The definitions given for "match" or replication of 
terminology vary from study to study and the definition is 
deliberately varied within some studies. This would seem 
to indicate that the investigators were not satisfied with 
the definitions of indexer consistency given in the litera- 
ture and that for these studies, the concept of "match" is 
not the concept normally meant by the term "match". This 
may well reflect an unexpressed realization that these defi- 
nitions were not distinguishing between degree of indexer 
consistency in perception of indexable matter and degree of 
indexer consistency in terminology. 

This study defines indexer consistency as being com- 
posed of two parts: 

1. Indexer consistency in the perception of indexable matter; 

2. Indexer consistency in the choice of terminology with 
which to label the indexable matter perceived. 

The purpose of the study is to demonstrate that these
two parts may usefully be considered separately, that each
may be present in differing degrees, that this distinction
has not been analyzed in previous studies, and that the
distinction offers useful avenues of approach to indexing
problems.

CHAPTER III

METHODOLOGY

Introduction 

Previous studies of indexer consistency have defined 
inter-indexer or intra-indexer consistency in terms of de- 
grees of replication in the indexing term or terms chosen 
by one indexer at two or more separate points in time, or by 
two or more indexers working independently, to characterize 
the informational content of a given text or texts. 

This definition of indexer consistency does not take 
into account the distinction made by Bernier, Lancaster, and 
Shaw and Rothman in their analyses of the indexing process 
quoted in Chapter I. These analyses distinguish between 
the concepts that indexers perceive as indexable matter in a 
given text and the term or terms that these indexers choose 
to characterize these concepts. 

The objective of this study was to determine whether,
for a given group of indexers, the degree of agreement in
their perception of concepts in texts would differ from the
degree of replication in the term or terms they chose to
characterize the concepts they perceived.

Basic Assumptions

1. Indexing is an order-dependent technique in that a
concept must be perceived before it can be expressed in
an index term.

2. Perception of concepts is a process distinct from
the process of choosing terms with which to characterize the
concepts perceived.

Hypothesis 

The degree of indexer consistency in the perception of 
indexable matter can be measured separately from and will be 
different in extent from the degree of indexer consistency in 
the terminology chosen to characterize that indexable matter. 

Selection of Sample of Articles

Five hundred fifty articles in the field of information
science and library science were chosen as the textual
material to be analyzed in this study. This sample is large
enough for the results to be designated as statistically 
valid. It is much larger than the number of texts analyzed 
in previous studies. 

The subject area was chosen because it is one that is 
familiar to the investigator and would be familiar to the 
people who would be employed as indexers and categorizers. 

The one hundred journal articles chosen for use in 
the first part of this study were selected according to the 
following criteria.

1. They were to be a random sample chosen from the articles
abstracted in Documentation Abstracts, II, No. 4 (1967).

2. Each abstract chosen represented a journal article published
in English. Articles published in Proceedings were
excluded.

To secure a random sample of a universe that has 
been or can be numbered, an appropriate series of random 
numbers is usually selected from a Table of Random Numbers 
and these numbers are then used to draw the sample from the 
larger universe. This was the procedure used to select the 
sample for this part of the study. From the 273 numbered
abstracts published in Documentation Abstracts, II, No. 4
(1967), abstracts were chosen that satisfied the requirements
stated in 1 and 2 above and whose last three digits corresponded
to succeeding numbers in the Table of Random Numbers
(8,000 Numbers) published in Arkin, Herbert and Raymond R.
Colton, Tables for Statisticians (New York: Barnes and Noble,
Inc., 1963) 168 p. College Outline Series No. 75, until a
total of one hundred abstracts, for which it was possible to
obtain the original articles from the collections of the
Columbia University Libraries, the New York Public Library,
and the Pratt Institute Library Service Library, had been
obtained.
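
The selection procedure just described can be sketched in Python. This is only an illustration under stated assumptions: a seeded pseudo-random generator stands in for the printed Table of Random Numbers, the abstracts are assumed to be numbered 1 through 273, and the eligibility rule (`is_eligible`) is a hypothetical placeholder for the two criteria and the availability of the original article.

```python
import random


def draw_sample(abstract_numbers, is_eligible, sample_size, seed=0):
    """Visit abstracts in random order, keeping eligible ones until the quota is met."""
    rng = random.Random(seed)  # stand-in for the printed random-number table
    order = list(abstract_numbers)
    rng.shuffle(order)  # random order in which abstract numbers are "drawn"
    sample = []
    for number in order:
        if is_eligible(number):  # e.g. English journal article, original obtainable
            sample.append(number)
        if len(sample) == sample_size:
            break
    return sample


# Assumed numbering 1..273; the rule below is invented purely for illustration.
universe = range(1, 274)
sample = draw_sample(universe, lambda n: n % 7 != 0, 100)
print(len(sample))
```

Drawing continues past ineligible abstracts, which mirrors the statement that selection went on until one hundred usable abstracts had been obtained.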

After completion of this part of the study, circum- 
stances made it possible to expand the number of articles in 
the study universe and thus to decrease the margin of sampling
error. An additional 450 articles were added to the study
universe.

These additional 450 articles represent all of the
English language journal articles abstracted in Documentation
Abstracts, II, Nos. 1, 2 and 3 (1967) which were available
from the sources mentioned above.

Characteristics of Articles in the Study 

All of the articles in the study were concerned in 
some way with librarianship, documentation, and information
science. They ranged in type from generalized discussions
with little hard, identifiable data, to articles which were
little more than lists of data. They included articles on
broad, inclusive subjects and also those which treated narrow 
topics in depth. Some of the articles were within the com- 
prehension of the average high school student. Others were 
of such a complicated nature that some of the analysts had 
trouble in understanding them completely. 

The sample of 550 articles was divided into 22 groups 
of 25. Each group was so chosen as to contain examples of 
the various types and levels of articles. Where abstracts 
had originally appeared with the article, they were deleted 
so as to prevent their content from affecting the judgement 
of the analysts. 



Selection of Concept Analysts

The people employed in the first part of the data
gathering stage of this study are called "concept analysts"
or "analysts" in the study because their task was to read the
texts used in the study and analyse them for concepts.



They performed the first three steps of the indexing
process as outlined by Bernier. In other words, they:

1. Selected concepts suitable for indexing;

2. Embodied the concept in a verbal paraphrase;

3. Refined the verbal paraphrase into an "embryonic index
entry".

They were not asked to perform the fourth step in 
Bernier's analysis of the subject indexing process, that is, 
the translation of the "embryonic index entries" into the 
standardized terminology of an indexing system, although, 
in some cases, because of the background and training of 
the analysts, the terms they used are standard terms or 
standard terminology in the field of library and information 
science. 

Concept analysts were chosen from among volunteers 
who were attending or had graduated from Columbia University 
School of Library Service or Pratt Institute Graduate School 
of Library and Information Science. This was done for a 
number of reasons. 

1. The analysts could be expected to have some knowledge of 
and interest in the subject matter of the articles. 

2. They could be expected to have some familiarity with the
terminology of the field.

3. They were actual or potential users of the literature.

The work experience and educational background of the
analysts were ascertained through use of a questionnaire
(Appendix B). The findings of this questionnaire are displayed
in Tables III-1 and III-2.

There were 34 analysts in all. Fifteen had
Bachelor’s degrees only and were working toward a Master’s
degree in Library Science. Eleven were either working to- 
ward Doctoral degrees or were Advanced Study students in 
Library Science. Six had already received Master’s degrees 
in other subject fields. 

Only two of the analysts had not had some work experience
in libraries or in library or information-oriented
tasks. Twenty-four of the 34 had worked at some type of
library or information-oriented task for one or more years.

No attempt was made to correlate indexer background, 
education, or work experience with the results of this study. 

Training of Concept Analysts 

The analysts were given a short (approximately 45 
minutes) indoctrination session in which a set of typed in- 
structions (Appendix C) was carefully reviewed. The 
analysts were also asked to analyze two articles in accor- 
dance with the instructions. 

The objective of the session was to train the analysts
to record the verbal labels they would ordinarily use for the
concepts they perceived as indexable matter in the articles.
Because their verbalization of their perceptions was the goal,

TABLE III-1

ANALYST CHARACTERISTICS - EDUCATION

CHARACTERISTICS                          NUMBER OF ANALYSTS*

Bachelor’s degree only                           15
Master’s degree in library science               17
Master’s degree in other subject                  6
Doctoral degree in other subject                  0

Undergraduate major
  English/English Literature                     11
  History                                        11
  Psychology                                      4
  Foreign Languages                               4
  Political Science                               2
  Asian Area Studies                              2
  Philosophy                                      1
  Biology                                         1
  Sociology                                       1
  Arts                                            1
  Business                                        1
  Education                                       1

Graduate Study
  Library Science                                34
  English Literature                              2
  Foreign Languages                               1
  Art History                                     1
  International Relations                         1
  History                                         1
  Anthropology                                    1
  Economics                                       1
  Social Sciences                                 1
  Religion                                        1

*The numbers total more than 34 because some analysts appear
in more than one category.

TABLE III-2

ANALYST CHARACTERISTICS - WORK EXPERIENCE

CHARACTERISTICS                          NUMBER OF ANALYSTS*

Worked in a library or done
library related work for
  Less than one year                              8
  One to three years                             11
  Four or more years                             13
Never worked in a library or
done library related work                         2

Type of library work
  Mainly clerical tasks                          11
  Reference                                      23
  Cataloging and classification                  11
  Administration                                 12
  Teaching                                        7
  Research                                        7
  Subject analysis                                3
  Acquisitions                                   11
  Automation                                      1
  Circulation                                    15
  Indexing                                        2
  Abstracting                                     2
  Worked in bookstore                             1
  Exhibitions                                     1
  Bindery                                         1
  Periodical Inventory                            1
  Readers Advisory Services                       2
  Children's Story Hours                          1
  Systems Analysis                                1
  Searching                                       1

*The numbers total more than 34 because some analysts appear
in more than one category.

the instructions were kept in general, non-prescriptive
terms except for the following.

1. A context for the analysis was given. The analysts were
told to imagine they were working for an information center
or library collecting materials in the area of information
science, documentation, and librarianship. The size of the
collection was not specified.

2. The analysts were instructed that the verbal labels they
chose did not have to conform to any standardized list of
terminology or to the author's words, but should be the words
they would ordinarily use to describe the concepts they perceived
as indexable matter. These might, of course, be the
standardized verbal labels of a classification system, but
they did not have to be. The analysts were not asked to
produce formal index entries.

3. The analysts were asked to reflect the exact concept
discussed. They were not to produce terms for a classification.
They were to produce terms that accurately characterized
the particular concepts they distinguished in the
texts.

4. An additional facet of the study was embodied in the last
paragraph of the instruction sheet. This was the possibility
that an analyst might be able to indicate what concepts were
discussed without being able to understand what was being
said about the concept. The analysts were therefore asked
to indicate their degree of comprehension of the information
in the article on the bottom of the data gathering sheet.
No analysis of the indicated degree of comprehension
was done for this study.

In addition to the instructions on the printed sheet, 
the analysts were all told orally to keep firmly in mind the 
distinction between the mere mention of an informational con- 
cept in the article and the discussion of actual information 
about the concept. They were only to include verbal labels 
for the subjects on which enough information was given to 
satisfy the needs of a patron wishing substantive information 
on the subject. 

After a thorough reading and discussion of the in- 
struction sheet, each analyst was asked to analyze two 
articles in the presence of the investigator. Their analyses 
were discussed in relation to the work that they were being 
asked to do. At no point were suggestions made as to what 
subjects should or should not have been included in their 
analyses. Throughout the short training sessions, the in- 
vestigator stressed that what was sought was the analysts’ 
perceptions of the content of the articles as expressed in 
their own verbal labels. 

Data Gathering Procedure 

After the short training experience described above, 
each analyst was then given a packet containing: 

1. Copies of twenty-five of the serial articles in the
sample;

2. A copy of the training instructions;

3. Twenty-five data gathering sheets (Appendix D).

They were told to analyze each article in accordance
with the instructions and write the verbal labels for the
concepts they identified on the data gathering sheets.

When the work was completed, usually within two to four 
weeks, the analysts returned the completed packets to the 
investigator and were paid a previously agreed upon lump sum. 

Each packet was analyzed by five people. The data
for this study, therefore, consist of 550 x 5 analyses (550
serial articles, each analyzed five times), or 2,750 individual
analyses in all.

Data Analysis Procedure 

Procedure Used to Determine 
Consistency in Terminology 

The individual verbal labels created by each analyst 
for each article were compared, article by article, for match 
in terminology, i.e., matches in entire terms, which might or 
might not be multi-word terms. 

Definition of Consistency in Choice
of Terminology

An exact match in terminology was defined as a word-for-word
match. Each verbal label had to contain the same
number of words, each word had to be identical in grammatical
morphology (i.e., "mechanize" and "mechanization" were not
considered a match) with its counterpart in the comparable
verbal label, and each word had to occupy the same position
in the comparable verbal label for the verbal labels to be
termed a match in terminology.

Punctuation was ignored, e.g. "Library schools curriculum"
and "Library schools, curriculum" were considered a
match; singular and plural forms of the same word were considered
a match; abbreviations were considered a match with
the words abbreviated; possessives were considered a match
with the non-possessive form, e.g. "IBM Watson Library" was
considered a match with "IBM’s Watson Library"; and differences
in capitalization and spelling were ignored, e.g.
"Aeroplane" and "airplane" were considered a match in terminology.
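
The matching rules above can be illustrated with a short Python sketch. It is not the procedure used in the study, where matching was done by inspection; the lookup tables (`ABBREVIATIONS`, `SPELLING_VARIANTS`) and the crude plural rule are assumptions introduced only to make the rules concrete.

```python
import re

# Hypothetical lookup tables: the study does not list the abbreviations
# or spelling variants that were treated as equivalent.
ABBREVIATIONS = {"info": "information", "sci": "science"}
SPELLING_VARIANTS = {"aeroplane": "airplane"}


def normalize_word(word):
    """Reduce one word to a comparable form under the stated rules."""
    word = word.lower()  # capitalization is ignored
    if word.endswith("'s") or word.endswith("’s"):
        word = word[:-2]  # possessive matches the non-possessive form
    word = re.sub(r"[^\w]", "", word)  # punctuation is ignored
    word = ABBREVIATIONS.get(word, word)  # abbreviation matches the full word
    word = SPELLING_VARIANTS.get(word, word)  # spelling differences are ignored
    # Crude assumption: only a trailing "s" is folded for singular/plural.
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        word = word[:-1]
    return word


def normalize_label(label):
    """Keep word order and word count; drop tokens that were only punctuation."""
    words = [normalize_word(w) for w in label.split()]
    return tuple(w for w in words if w)


def terminology_match(label_a, label_b):
    """True when the labels match word for word, position by position."""
    return normalize_label(label_a) == normalize_label(label_b)


print(terminology_match("Library schools curriculum", "Library schools, curriculum"))
print(terminology_match("mechanize", "mechanization"))
```

Because the normalized labels are compared as ordered tuples, two labels with the same words in a different order do not match, in keeping with the requirement that each word occupy the same position.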

The rather strict definition of consistency in ter- 
minology used in this study accounts, to some degree, for the 
low percentages recorded for consistency in terminology. 

When a looser definition of consistency in terminology was 
experimented with (a match in terminology was said to occur 
when the first two substantive words in the verbal labels 
were the same), and the formulas presented later in this 
chapter were used to compute the terminology consistency 
scores, the percentages of consistency in terminology rose. 
However, in the few cases in which this "loose" definition of 
terminology consistency was experimented with, the resulting 
percent of consistency in choice of terminology still never 
approached the percent of consistency in choice of concept. 
Table III-3 displays the results of this experimentation
with the two different definitions of consistency in terminology.



TABLE III-3

CONSISTENCY SCORES RESULTING FROM THE USE OF TWO
DIFFERENT DEFINITIONS OF TERMINOLOGY CONSISTENCY

ARTICLE    MEAN CONCEPT    MEAN TERMINOLOGY    MEAN TERMINOLOGY
NUMBER     CONSISTENCY     CONSISTENCY*        CONSISTENCY+

1063       38.9%            0.0%               19.6%
1074       43.4%            1.5%               14.0%
1108       26.8%            0.9%               10.5%
1121       44.8%           10.8%               24.2%
1149       36.3%            6.8%                7.4%

*Defined as in this study.

+Defined as the replication of the first two words in the
verbal label.
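
The looser definition can be sketched as follows. The stop list of non-substantive words is an assumption, since the study does not say which words were treated as substantive, and treating a one-word label as comparable on that single word is likewise a choice made here for illustration.

```python
import re

# Hypothetical stop list of non-substantive words.
NON_SUBSTANTIVE = {"a", "an", "the", "of", "for", "in", "on", "and", "to", "by", "with"}


def substantive_words(label):
    """Lower-cased words of the label, with punctuation and stop words dropped."""
    words = re.findall(r"[a-z0-9]+", label.lower())
    return [w for w in words if w not in NON_SUBSTANTIVE]


def loose_match(label_a, label_b):
    """True when the first two substantive words of both labels agree."""
    first_a = substantive_words(label_a)[:2]
    first_b = substantive_words(label_b)[:2]
    return bool(first_a) and first_a == first_b


print(loose_match("Information scientists - training",
                  "Information scientists, salaries of"))
```

Under this rule the two labels in the usage line count as a match even though they would not match under the word-for-word definition, which is consistent with the higher percentages reported in the last column of Table III-3.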



Procedure Used in Determining 
Concept Consistency 

The individual verbal labels recorded by each analyst 
for each article were then examined for match in concepts. 
They were arranged in concept categories based on synonymy 
using the mathematical concept of the fuzzy set, a set in 
which there are continuums of grades of membership. Zadeh 
discusses the fuzzy set as follows. 

More often than not, the classes of objects encountered
in the real physical world do not have precisely
defined criteria of membership. For example, the class
of animals clearly includes dogs, horses, birds, etc. as
its members, and clearly excludes such objects as rocks,
fluids, plants, etc. However, such objects as starfish,
bacteria, etc. have an ambiguous status with respect to
the class of animals. The same kind of ambiguity arises
in the case of a number such as 10 in relation to the
"class" of all real numbers which are much greater
than 1.

Clearly, the "class of all real numbers which are
much greater than 1," or "the class of beautiful women,"
or "the class of tall men," do not constitute classes or
sets in the usual mathematical sense of these terms.
Yet, the fact remains that such imprecisely defined
"classes" play an important role in human thinking,
particularly in the domains of pattern recognition,
communication of information, and abstraction.¹

Zadeh defines the fuzzy set as a class "with a continuum of
grades of membership"² and states that

A fuzzy set provides a convenient point of departure
for the construction of a conceptual framework which
parallels in many respects the framework used in the
case of ordinary sets, but is more general than the
latter and, potentially, may prove to have a much wider
scope of applicability, particularly in the fields of
pattern classification and information processing.
Essentially, such a framework provides a natural way of
dealing with problems in which the source of imprecision
is the absence of sharply defined criteria of class membership
rather than the presence of random variables.³

Thus, the concept categories established for the verbal labels
produced by the analysts for each article in this study were
categories hospitable to synonyms, that is,

A word having a meaning similar to that of another
word in the same language. ... A word or expression
accepted as a figurative or symbolic substitute for
another word or expression.⁴

They did not have to have identity of meaning, simply synonymy.
They constituted fuzzy sets.

¹L. A. Zadeh, "Fuzzy Sets," Information and Control
VIII (1965): 338-9.

²Ibid.

³Ibid.

⁴The American Heritage Dictionary of the English Language,
William Morris, ed. (New York: American Heritage
Publishing Co., Inc., 1969), p. 1305.
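
As an aside, the notion of grades of membership can be made concrete with a small Python sketch. The numeric grades are invented for illustration; the study assigned labels to categories by judgment and did not record numeric grades. The union and intersection follow Zadeh's definitions (maximum and minimum of the grades).

```python
# A concept category modelled as a fuzzy set: each verbal label has a
# grade of membership between 0.0 and 1.0. Grades here are invented.
training = {
    "INFORMATION SCIENTISTS - TRAINING": 1.0,
    "LIBRARY SCHOOLS - CURRICULUM FOR INFORMATION SCIENCE": 0.7,
}


def grade(fuzzy_set, label):
    """Grade of membership; labels not listed have grade 0.0."""
    return fuzzy_set.get(label, 0.0)


def fuzzy_union(a, b):
    """Zadeh's union: the larger of the two grades for each label."""
    return {k: max(grade(a, k), grade(b, k)) for k in a.keys() | b.keys()}


def fuzzy_intersection(a, b):
    """Zadeh's intersection: the smaller of the two grades for each label."""
    return {k: min(grade(a, k), grade(b, k)) for k in a.keys() | b.keys()}


print(grade(training, "LIBRARY SCHOOLS - CURRICULUM FOR INFORMATION SCIENCE"))
```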

The process of the creation of the concept cate- 
gories was essentially a subjective one. Although, for 
many reasons, it was necessary for the investigator to cate- 
gorize the analyst verbal labels for most of the packets, it 
was possible, in two cases, to have the categorization of a 
packet done by someone other than the investigator. There- 
fore, although twenty of the packets were concept categorized 
by the investigator, two packets, one each, were concept 
categorized by two specially trained indexers. 

This was done to determine whether the pattern of 
the concept consistency scores derived from the categoriza- 
tions done by these indexers would differ greatly from the 
pattern of the scores derived from the categorizations done 
by the investigator. 

The two categorizers were each given a copy of in- 
structions (Appendix E) and were asked to categorize the 
verbal labels of two articles in the investigator's presence. 
They were then each given the data gathering sheets for one 
packet of articles and asked to categorize the verbal labels 
in them in accordance with the instructions. 

When the verbal labels for each article in a packet 
had all been assigned to concept categories, the category 
symbols for the appropriate categories were punched on the 
IBM cards that had already been punched with the verbal 
labels. Then the verbal labels and the concept categories 
to which they had been assigned were manipulated and
printed out, category by category, by computer.

Since a single verbal label often was placed in more
than one concept category, the bulk of the print-out of all
categories for all articles makes reproduction here unfeasible.
Appendix F contains the categorized print-outs from
ten of the articles.

Definition of Consistency in Choice of Concept

Concept was defined as in Webster’s New World Dictionary
of the American Language (New York: World Publishing
Company, c. 1960) 302: "an idea, especially a generalized
idea of a class of objects; a thought; general
notion"; and as defined in The American Heritage Dictionary
of the English Language (Boston: American Heritage Publishing
Company, Inc., and Houghton Mifflin Company, c. 1969)
275: "1. A general idea or understanding, especially one
derived from specific instances or occurrences. 2. A thought
or notion."

Although it was relatively easy to establish a de- 
finition for consistency in choice of terminology, establi- 
shing a definition for consistency in perception of concept 
was more difficult. 

The word "concept" is defined in an abstruse, abstract,
non-concrete way (witness the definitions given
above). These definitions, therefore, may be accurate, but
they are not precise in their expression. This was one
reason why the fuzzy set was chosen as the basis for the
establishment of the concept categories in this study. It
was also one of the reasons why the concept categorization
was done by more than one person. The results of the comparison
of the categorizations done by different categorizers
are discussed in Chapter V.

It was not expected that precise replication of 
categorization by different investigators was likely to 
occur. However, the cross-test for this study indicates 
statistical reliability of the procedure at least sufficient 
for the immediate purposes of this study. Since the data 
are available, other investigators may test this aspect of 
the procedure, and the conclusions should be verifiable 
through replication of the experiment or only this part of 
it. 



Computation of the Quantitative Measurements
Used in the Study

Each packet of twenty-five articles was, as noted
earlier, analyzed for indexable matter by five analysts.
To arrive at a measure of inter-indexer consistency for
every analyst in comparison with every other analyst for the
packet, each analyst was paired with each of the other analysts
in turn. The pairs for each packet were, therefore:
Analysts 1 and 2, Analysts 1 and 3, Analysts 1 and 4, Analysts
1 and 5, Analysts 2 and 3, Analysts 2 and 4, Analysts
2 and 5, Analysts 3 and 4, Analysts 3 and 5, and Analysts 4 and
5. For each packet of twenty-five articles, there were ten
pairs of analysts.
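
The pairing can be checked mechanically. The short Python sketch below enumerates every unordered pair drawn from five analysts; the analyst numbers 1 through 5 are used only as position labels within a packet.

```python
from itertools import combinations

analysts = [1, 2, 3, 4, 5]

# Every unordered pair of distinct analysts, in the order listed in the text.
pairs = list(combinations(analysts, 2))

for first, second in pairs:
    print(f"Analysts {first} and {second}")
print(len(pairs))
```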



The quantitative measure used to arrive at a statement
of indexer consistency for this study is based on the
one described on page 117 of Saracevic and Goldwyn.⁵ The
formula they use is as follows:

\[
\text{Indexer consistency} =
\frac{\text{Number of terms in agreement}}
     {\text{Total number of unique terms}}
\]

This formula, of course, reflects the definition of indexer 
consistency in which no distinction is made between indexer 
consistency in choice of terminology and indexer consistency 
in perception of indexable matter or concepts. 

The formulas used in the present study are directly
based on the Saracevic and Goldwyn formula, but are modified
to produce two separate measures of indexer consistency:
consistency in choice of terminology and consistency in perception
of concept.

⁵Tefko Saracevic and A. J. Goldwyn, An Inquiry Into
Testing of Information Retrieval Systems. Part I: Objectives,
Methodology, Design, and Controls (Cleveland, Ohio:
Case Western Reserve University Center for Documentation
and Communication Research, 1968).

Formula Used for Terminology Consistency Scores

The inter-indexer consistency in choice of terminology
for the concept labels chosen by each pair of analysts
in the group who analyzed each article for this study was
calculated using the following formula.

\[
\text{Inter-indexer consistency in choice of terminology for each pair of analysts} =
\frac{\text{Number of verbal labels chosen by both analysts that had matching terminology}}
     {\text{Number of unique verbal labels chosen by both analysts}}
\]

Then the arithmetic mean of the Consistency in
Terminology Scores of all pairs of analysts was calculated
and this became the stated measure of inter-analyst (inter-indexer)
consistency in choice of terminology for the
article. Appendix G contains examples of the tables derived
from the use of this formula and of the formula which
follows.
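
A minimal Python sketch of this computation for a single article is given here. It assumes that the verbal labels have already been reduced to a comparable form, so that set membership stands for a match in terminology, and it reads "unique verbal labels chosen by both analysts" as the distinct labels used by either member of the pair. The labels in the example are hypothetical.

```python
from itertools import combinations
from statistics import mean


def pair_consistency(labels_a, labels_b):
    """Matching labels divided by the distinct labels used by the pair."""
    a, b = set(labels_a), set(labels_b)
    unique = a | b
    if not unique:
        return 0.0  # assumption: a pair with no labels at all scores zero
    return len(a & b) / len(unique)


def article_consistency(labels_by_analyst):
    """Arithmetic mean of the pair scores over every pair of analysts."""
    scores = [
        pair_consistency(labels_by_analyst[x], labels_by_analyst[y])
        for x, y in combinations(sorted(labels_by_analyst), 2)
    ]
    return mean(scores)


# Hypothetical, already-normalized labels from three analysts for one article.
labels = {
    1: {"information scientist training", "salary"},
    2: {"information scientist training"},
    3: {"salary survey"},
}
print(article_consistency(labels))
```

In the study itself each article had five analysts and therefore ten pair scores; three analysts are used above only to keep the example short.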



Formula Used for Concept Consistency Scores

The inter-indexer consistency in identification of
concepts for each article for each pair of analysts was
computed on the basis of the following formula, a modification
of the formula used for computation of inter-indexer
consistency in choice of terminology.

\[
\text{Inter-indexer consistency in choice of concept for each pair of analysts} =
\frac{\text{Number of synonymous concepts chosen by both analysts}}
     {\text{Total number of unique concepts chosen by both analysts}}
\]

Then the arithmetic mean of the consistency in
choice of concept of all the pairs was calculated and this
became the stated measure of inter-indexer consistency in
choice of concept for the article.

Percentages in both sets of calculations were com- 
puted to the second place to the right of the decimal point 
and rounded to the first. 
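
The concept measure and the rounding rule can be sketched in Python as follows. Two assumptions are made: a concept counts as chosen by an analyst when any of that analyst's labels was assigned to its category, and ties are rounded half up (the study does not state how ties were handled). Analyst 12 and the second and third records are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP


def categories_used(records, analyst):
    """Concept categories assigned to any label written by one analyst."""
    return {c for a, cats, _label in records if a == analyst for c in cats}


def concept_consistency(records, first, second):
    """Shared categories divided by the distinct categories used by the pair."""
    a = categories_used(records, first)
    b = categories_used(records, second)
    unique = a | b
    return len(a & b) / len(unique) if unique else 0.0


def as_percentage(score):
    """Compute to two decimal places, then round to one."""
    two_places = (Decimal(str(score)) * 100).quantize(
        Decimal("0.01"), rounding=ROUND_HALF_UP
    )
    return two_places.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


# (analyst, category symbols, verbal label) for one article.
records = [
    (11, "OCA", "SALARIES FOR BEGINNING INFORMATION SCIENTISTS"),
    (12, "AB", "INFORMATION SCIENTISTS - TRAINING"),
    (12, "C", "SALARIES"),
]
print(as_percentage(concept_consistency(records, 11, 12)))
```

Here the two analysts share categories A and C out of the four distinct categories O, C, A, and B, so the pair score is one half.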







The two measures of consistency were then compared 
to test the hypothesis. 

Availability of Raw Data for Use by
Other Investigators

The methodology, raw data, and findings for this
investigation will be available from the investigator for
a period of five years after its publication. Interested
researchers may use this material either for their own
purposes or to investigate the methodology and findings of
this study itself.

The study was designed to be replicable. In addition,
cross checks between packets of articles, all of which
contained different articles and were analyzed by different
combinations of indexers, reveal a pattern of results indicating
that the differences found were of a gross nature
and that a higher degree of precision in the definitions 
used (although desirable) was not a requirement for the 
determination of meaningful conclusions. It is to be 
hoped that the study itself may lead to means for the 
greater refinement of techniques for studies of this kind. 



CHAPTER IV

CONCEPT CATEGORIZATION

Physical Format Used to Display
Analysts' Verbal Labels

Below is a reproduction of the computer print-out
of one of the verbal labels assigned by one of the analysts
to the subject content of one of the articles in this
study. All of the analyst labels were organized in this
manner.

11 1075 OCA SALARIES FOR BEGINNING INFORMATION SCIENTISTS

The print-out is divided into four fields. The first field
contains the analyst's identification number. The next
contains the article identification number. The third
field contains the alphabetic symbols for the concept
categories assigned to this verbal label. The last field
contains the actual words in the verbal label created by
the analyst. In other words, this verbal label, SALARIES
FOR BEGINNING INFORMATION SCIENTISTS, was created by
analyst 11 for article 1075 and was seen by the categorizer
to contain concepts from categories O, C, and A
(BEGINNING; SALARIES; and INFORMATION SCIENTISTS). In the
complete categorization of the verbal labels for this
article, the verbal label itself is printed under each of
the three categories.
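
The four-field record and the category-by-category listing can be illustrated in Python. The sketch assumes that the fields are separated by blanks; the original print-out may well have used fixed card columns, and the names `LabelRecord`, `parse_record`, and `by_category` are introduced here only for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class LabelRecord(NamedTuple):
    analyst: int      # analyst identification number
    article: int      # article identification number
    categories: str   # one alphabetic symbol per concept category
    label: str        # the verbal label as written by the analyst


def parse_record(line):
    """Split a print-out line into its four fields (blank-separated, by assumption)."""
    analyst, article, categories, label = line.split(maxsplit=3)
    return LabelRecord(int(analyst), int(article), categories, label)


def by_category(records):
    """List each verbal label under every category assigned to it."""
    index = defaultdict(list)
    for record in records:
        for symbol in record.categories:
            index[symbol].append(record.label)
    return dict(index)


record = parse_record("11 1075 OCA SALARIES FOR BEGINNING INFORMATION SCIENTISTS")
for symbol, labels in sorted(by_category([record]).items()):
    print(symbol, labels)
```

The label appears once under each of its three categories, which is why the complete categorized print-out is much bulkier than the list of labels itself.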

Basis for the Establishment of Concept Categories 

All of the verbal labels assigned to the articles 
by the five analysts were categorized in a similar manner. 
The generalized context of the categorization was conceived
of as a type of coordinate index. Each of the articles was
categorized without relation to the categories previously
established for any other articles. Each verbal label was
scanned individually, reduced to what were perceived as
separate concepts, and categorized according to these
concepts.

It is apparent that the categorizers' perception and
identification of the concepts chosen by the analysts was
subjective. However, the goal was to isolate "every"
concept in every label. These concepts were then assigned 
names, and each name represented one concept category. At 
no time did the categorizers read or refer to the actual 
article analyzed. 

Example and Explanation of the Categorization
Process as Exhibited in the Analysis of
the Analysts' Verbal Labels for a
Particular Article

The particular article to which the previously
reproduced analyst verbal label was assigned is: Theodore
C. Hines, "Salaries and Academic Training Programs for
Information Scientists," Journal of Chemical Documentation
VII (May 1967): 118-20. The categorization of the verbal
labels assigned by the analysts to the subject content of
this article was quite straightforward.

A step-by-step explanation of the method used in 
assigning concept categories to the verbal labels created 
for the article by the analysts is given on the next few 
pages. The categorization in its entirety is displayed 
following the explanation. 

All of the verbal labels created by the five analysts
for article 1075 were keypunched individually on IBM cards
exactly as written by the analysts. They were then printed
out by computer, analyst by analyst. This print-out was
read by the categorizer for the purpose of assigning concept
categories.


The first verbal label on the print-out for
article 1075 was INFORMATION SCIENTISTS - TRAINING. The
categorizer perceived this label as containing the concepts
INFORMATION SCIENTISTS and TRAINING. These concepts were
therefore arbitrarily assigned the category labels 1075A
INFORMATION SCIENTISTS and 1075B TRAINING. The other verbal
labels created for this article were then scanned. If any
of them contained the concept INFORMATION SCIENTIST,
category A was assigned to that label. If a label did not
contain the concept INFORMATION SCIENTIST, category A
was not assigned to it. The actual words "information
scientist" did not have to appear in the verbal label for it
to be assigned category A. For instance, the verbal label
PERSONNEL, INFO. SCI. was assigned to category 1075A.
Likewise, the verbal label LIBRARY SCHOOLS - CURRICULUM FOR
INFORMATION SCIENCE was assigned to category 1075B even
though the actual word "training" does not appear in the
label. When all of the verbal labels that contained the
concepts INFORMATION SCIENTISTS and TRAINING had been
assigned the proper alphabetic symbol, the second verbal
label on the print-out was read. Let us suppose that this
second verbal label was INFORMATION SCIENTISTS - SALARIES.
The concept INFORMATION SCIENTISTS had already received a
category label and alphabetic symbol. It was therefore not
considered again. The only new concept in this verbal
label is SALARIES. The concept category 1075C SALARIES
was therefore established and each succeeding verbal label
on the print-out was scanned for the concept SALARIES.
When a verbal label was found to contain the concept
SALARIES, it was assigned the category symbol C.

This procedure was continued until all the concepts 
contained in all the verbal labels created for article 1075 
had been assigned symbols and each verbal label had been 
searched for each concept. 

The alphabetic symbols assigned to each verbal
label were then keypunched on the IBM cards already punched
with the verbal label. These cards and a categorization
program deck of cards were then put through the computer
and the resulting print-out listed category labels for each
article and the verbal labels assigned to each category on
the succeeding pages.
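
The listing step performed by the categorization program can be approximated as follows. This is a minimal sketch and not the program actually used in the study: the function name list_by_category is hypothetical, and only four of the article 1075 cards are included for brevity.

```python
from collections import defaultdict

# (analyst, article, symbols, label) rows taken from the article 1075 print-out.
records = [
    (6, 1075, "BA", "INFORMATION SCIENTISTS-TRAINING"),
    (6, 1075, "CA", "INFORMATION SCIENTISTS-SALARIES"),
    (11, 1075, "OCA", "SALARIES FOR BEGINNING INFORMATION SCIENTISTS"),
    (2, 1075, "C", "SALARIES"),
]


def list_by_category(rows):
    """Repeat each card under every category symbol punched on it."""
    listing = defaultdict(list)
    for analyst, article, symbols, label in rows:
        for symbol in symbols:
            listing[symbol].append((analyst, article, symbols, label))
    return dict(listing)


listing = list_by_category(records)
for symbol in sorted(listing):
    print(symbol)
    for analyst, article, symbols, label in listing[symbol]:
        print(f"  {analyst:>2} {article} {symbols:<5} {label}")
```

A card bearing three symbols, such as OCA, is therefore printed three times, once under each of its categories.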

The alphabetic (or in some cases punctuation mark)
symbols assigned to differentiate the categories from one
another do not indicate any relationship between the concepts
established for a single article or between the concepts
established for different articles. They were simply
assigned one after the other in no particular meaningful
way, beginning arbitrarily with the letter A for the first
concept identified in a particular article's verbal labels.
The order in which the concept category labels were assigned
alphabetical symbols was influenced only by the order of the
verbal labels in the print-out, and although the verbal
labels for each article were grouped by analyst, the order
in which the verbal labels appeared in each analyst-grouping
was dictated only by the order in which the verbal labels
had been keypunched.
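
The arbitrariness of the symbols can be illustrated with a short sketch. The perception of concepts was a human judgment, so here it is supplied as data; the function name assign_symbols is hypothetical, and the sketch stops at 26 concepts rather than imitating the punctuation-mark symbols used in the study.

```python
from string import ascii_uppercase


def assign_symbols(labels_with_concepts):
    """Give each concept the next free letter, in order of first appearance."""
    symbol_of = {}
    for _label, concepts in labels_with_concepts:
        for concept in concepts:
            if concept not in symbol_of:
                if len(symbol_of) >= len(ascii_uppercase):
                    raise ValueError("more than 26 concepts in one article")
                symbol_of[concept] = ascii_uppercase[len(symbol_of)]
    return symbol_of


# Concepts as perceived by the categorizer (a human judgment, given as input).
perceived = [
    ("INFORMATION SCIENTISTS - TRAINING", ["INFORMATION SCIENTISTS", "TRAINING"]),
    ("INFORMATION SCIENTISTS - SALARIES", ["INFORMATION SCIENTISTS", "SALARIES"]),
]
symbols = assign_symbols(perceived)
print(symbols)
```

Reversing the order of the two labels would give SALARIES the letter B and TRAINING the letter C, which is why the letters carry no meaning of their own.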

PRINT-OUT OF CATEGORIZATION OF ARTICLE NO. 1075

A. INFORMATION SCIENTISTS

 6 1075 BA     INFORMATION SCIENTISTS-TRAINING
 6 1075 CA     INFORMATION SCIENTISTS-SALARIES
 6 1075 DA     INFORMATION SCIENTISTS-AVAILABILITY (I.E. NUMBER)
 6 1075 GBAK   LIBRARY SCHOOLS-TRAINING OF INFORMATION SCIENTISTS
 6 1075 HCA    INFORMATION SCIENTISTS-ADVANCED POSITIONS-SALARIES
 6 1075 IA     INFORMATION SCIENTISTS-RECRUITMENT
 6 1075 JCA    INFORMATION SCIENTISTS-SALARIES-COMPARED TO CHEMISTS' SALARIES
 5 1075 IA     RECRUITMENT-INFORMATION SCIENTISTS
 5 1075 NBA    SCIENCE TRAINING REQUIREMENT-INFORMATION SCIENTISTS
11 1075 OCA    SALARIES FOR BEGINNING INFORMATION SCIENTISTS
11 1075 KBA    ACADEMIC TRAINING PROGRAMS FOR INFORMATION SCIENTISTS
11 1075 IA     RECRUITING INFORMATION SCIENTISTS
 2 1075 A      PERSONNEL, INFO. SCI.
13 1075 KBA    INFORMATION SCIENTISTS, ACADEMIC TRAINING PROGRAMS

B. TRAINING

 6 1075 BA     INFORMATION SCIENTISTS-TRAINING
 6 1075 GBAK   LIBRARY SCHOOLS-TRAINING OF INFORMATION SCIENTISTS
 5 1075 KFB    INFORMATION SCIENCE-ACADEMIC TRAINING PROGRAMS
 5 1075 KGFB   LIBRARY SCHOOLS-CURRICULUM FOR INFORMATION SCIENCE
 5 1075 NBA    SCIENCE TRAINING REQUIREMENT-INFORMATION SCIENTISTS
11 1075 KBA    ACADEMIC TRAINING PROGRAMS FOR INFORMATION SCIENTISTS
 2 1075 B      TRAINING
13 1075 KBA    INFORMATION SCIENTISTS, ACADEMIC TRAINING PROGRAMS
13 1075 ZKGFB  LIBRARY SCHOOLS OFFERING INFORMATION SCIENCE COURSES IN 1966

C. SALARIES

 6 1075 CA     INFORMATION SCIENTISTS-SALARIES
 6 1075 HCA    INFORMATION SCIENTISTS-ADVANCED POSITIONS-SALARIES
 6 1075 JCA    INFORMATION SCIENTISTS-SALARIES-COMPARED TO CHEMISTS' SALARIES
 5 1075 FC     INFORMATION SCIENCE-SALARIES
11 1075 OCA    SALARIES FOR BEGINNING INFORMATION SCIENTISTS
 2 1075 C      SALARIES
13 1075 FC     INFORMATION SCIENCE, SALARIES IN

D. NUMBER OF INFORMATION SCIENTISTS AVAILABLE; PROFESSIONAL PERSONNEL POOL

 6 1075 DA     INFORMATION SCIENTISTS-AVAILABILITY (I.E. NUMBER)
11 1075 D      PROFESSIONAL PERSONNEL SHORTAGES

E. STUDENT SUPPORT; FINANCIAL AID; SCHOLARSHIPS

 6 1075 FE     INFORMATION SCIENCE STUDENTS-SUPPORT
 5 1075 GE     LIBRARY SCHOOLS-STUDENT AID
11 1075 FE     LEVEL OF SUPPORT FOR INFORMATION SCIENCE STUDENTS
 2 1075 E      SCHOLARSHIPS
13 1075 FE     INFORMATION SCIENCE, FELLOWSHIPS

F. INFORMATION SCIENCE

 6 1075 FE     INFORMATION SCIENCE STUDENTS-SUPPORT
 5 1075 FC     INFORMATION SCIENCE-SALARIES
 5 1075 KFB    INFORMATION SCIENCE-ACADEMIC TRAINING PROGRAMS
 5 1075 KGFB   LIBRARY SCHOOLS-CURRICULUM FOR INFORMATION SCIENCE
11 1075 FE     LEVEL OF SUPPORT FOR INFORMATION SCIENCE STUDENTS
 2 1075 F      INFO. SCI.
13 1075 FC     INFORMATION SCIENCE, SALARIES IN
13 1075 ZKGFB  LIBRARY SCHOOLS OFFERING INFORMATION SCIENCE COURSES IN 1966
13 1075 FE     INFORMATION SCIENCE, FELLOWSHIPS

G. LIBRARY SCHOOLS

 6 1075 GBAK   LIBRARY SCHOOLS-TRAINING OF INFORMATION SCIENTISTS
 5 1075 LG     LIBRARY SCHOOLS-ADMISSION REQUIREMENTS
 5 1075 MG     LIBRARY SCHOOLS-FINANCIAL SUPPORT
 5 1075 GE     LIBRARY SCHOOLS-STUDENT AID
 5 1075 KGFB   LIBRARY SCHOOLS-CURRICULUM FOR INFORMATION SCIENCE
 2 1075 G      LIBRARY SCHOOLS
13 1075 ZKGFB  LIBRARY SCHOOLS OFFERING INFORMATION SCIENCE COURSES IN 1966

H. ADVANCED POSITIONS

 6 1075 HCA    INFORMATION SCIENTISTS-ADVANCED POSITIONS-SALARIES

I. RECRUITMENT

 6 1075 IA     INFORMATION SCIENTISTS-RECRUITMENT
 5 1075 IA     RECRUITMENT-INFORMATION SCIENTISTS
11 1075 IA     RECRUITING INFORMATION SCIENTISTS
 2 1075 I      RECRUITING

J. CHEMISTS

 6 1075 JCA    INFORMATION SCIENTISTS-SALARIES-COMPARED TO CHEMISTS' SALARIES

K. ACADEMIC TRAINING PROGRAMS; CURRICULUM

 6 1075 GBAK   LIBRARY SCHOOLS-TRAINING OF INFORMATION SCIENTISTS
 5 1075 KFB    INFORMATION SCIENCE-ACADEMIC TRAINING PROGRAMS
 5 1075 KGFB   LIBRARY SCHOOLS-CURRICULUM FOR INFORMATION SCIENCE
11 1075 KBA    ACADEMIC TRAINING PROGRAMS FOR INFORMATION SCIENTISTS
 2 1075 K      CURRICULUM
13 1075 KBA    INFORMATION SCIENTISTS, ACADEMIC TRAINING PROGRAMS
13 1075 ZKGFB  LIBRARY SCHOOLS OFFERING INFORMATION SCIENCE COURSES IN 1966

L. ADMISSION REQUIREMENTS

 5 1075 LG     LIBRARY SCHOOLS-ADMISSION REQUIREMENTS

M. FINANCIAL SUPPORT OF LIBRARY SCHOOLS

 5 1075 MG     LIBRARY SCHOOLS-FINANCIAL SUPPORT

N. TRAINING IN SCIENCE

 5 1075 NBA    SCIENCE TRAINING REQUIREMENT-INFORMATION SCIENTISTS

O. BEGINNING POSITIONS

11 1075 OCA    SALARIES FOR BEGINNING INFORMATION SCIENTISTS

P. COSATI REPORT

 2 1075 P      COSATI REPORT

Q. CHEMICAL AND ENGINEERING NEWS

 2 1075 Q      CHEMICAL & ENGINEERING NEWS

R. CHEMISTRY

 2 1075 R      CHEMISTRY

S. WOMEN

 2 1075 S      WOMEN

T. NATIONAL RESEARCH CENTER

 2 1075 T      NAT. RESEARCH CENTER

U. HIGHER EDUCATION ACT

 2 1075 U      HIGHER ED. ACT

V. AMERICAN DOCUMENTATION

 2 1075 V      AMER. DOCUMENTATION

W. STATISTICS

 2 1075 W      STATISTICS

X. INDUSTRY

 2 1075 X      INDUSTRY

Y. CHEMICAL ABSTRACTS

 2 1075 Y      CHEMICAL ABSTRACTS

Z. 1966

13 1075 ZKGFB  LIBRARY SCHOOLS OFFERING INFORMATION SCIENCE COURSES IN 1966



Concept categories 1075 A, B, C, E, F, and K were
all chosen by all five analysts. That is, all the analysts
created verbal labels that embodied the concepts INFORMATION
SCIENTISTS, TRAINING, SALARIES, STUDENT FINANCIAL AID,
INFORMATION SCIENCE, and ACADEMIC TRAINING PROGRAMS. This
article was obviously about the salaries and academic
training of information scientists.

Category G, LIBRARY SCHOOLS, was identified as a
subject concept by four of the analysts, as was Category I,
RECRUITMENT.





In addition to the above concept categories, however,
a number of other subject concepts were identified by one
or more of the analysts. These included Categories D, THE
NUMBER OF INFORMATION SCIENTISTS AVAILABLE; J, CHEMISTS;
L, ADMISSION REQUIREMENTS; etc. The concepts identified by
fewer than four of the five analysts probably represent
peripheral areas touched on by the author. Obviously, some
analysts believed indexable information on them was contained
in the article; some did not. The analysts'
perception of concepts as indexable or non-indexable matter
varied. What was of prime interest for this study, of
course, was whether they varied to a greater or lesser
extent than the terminology used by each analyst to describe
the concepts he chose to record. The statistics on the
percent of indexer consistency in choice of concept and in
choice of terminology for article 1075 are presented in
Table IV - 1.

Inter-indexer consistency in perception of concept
for each pair of analysts ranged from a low of 35.0% to
a high of 66.7%. The mean concept consistency for all
pairs of analysts was 49.7%. None of the verbal labels
created by the analysts matched those of any other analyst.
Terminology consistency was therefore 0.0%.
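
The concept consistency figures for article 1075 can be re-derived from the category symbols in the print-out. The formulas themselves are given in Chapter III and are not reproduced here, so the sketch below rests on an assumption: that pairwise consistency is the number of categories two analysts have in common divided by the number of categories used by either of them. Under that assumption the computed values agree with the concept consistency column of Table IV - 1. The names concept_set and consistency are hypothetical.

```python
from itertools import combinations

# Category symbol strings per analyst, transcribed from the article 1075 print-out.
symbols_by_analyst = {
    6: ["BA", "CA", "DA", "GBAK", "HCA", "IA", "JCA", "FE"],
    5: ["IA", "NBA", "KFB", "KGFB", "FC", "GE", "LG", "MG"],
    11: ["OCA", "KBA", "IA", "D", "FE"],
    2: list("ABCEFGIKPQRSTUVWXY"),
    13: ["KBA", "ZKGFB", "FC", "FE"],
}


def concept_set(symbol_strings):
    """All distinct category symbols appearing on one analyst's cards."""
    return set("".join(symbol_strings))


def consistency(a, b):
    """Assumed measure: shared categories over categories used by either analyst."""
    return 100.0 * len(a & b) / len(a | b)


concepts = {analyst: concept_set(s) for analyst, s in symbols_by_analyst.items()}
pair_scores = {
    (x, y): consistency(concepts[x], concepts[y])
    for x, y in combinations(concepts, 2)
}
mean_score = sum(pair_scores.values()) / len(pair_scores)

for (x, y), score in pair_scores.items():
    print(f"{x} and {y}: {score:.1f}%")
print(f"mean of all pairs: {mean_score:.1f}%")
```

For analysts 6 and 11, for example, eight categories are shared out of twelve used by either analyst, which gives 66.7%.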

As stated before, all the analysts created verbal
labels that embodied the concepts INFORMATION SCIENTISTS,
TRAINING, SALARIES, STUDENT FINANCIAL AID, INFORMATION
SCIENCE, and ACADEMIC TRAINING PROGRAMS. The title of the
article, "Salaries and Academic Training Programs for
Information Scientists," contains all but one of these
concepts, and a KWIC index of the title would provide many
appropriate words as access points.

TABLE IV - 1

PERCENTAGES OF INTER-ANALYST CONSISTENCY IN CHOICE OF
CONCEPTS AND IN CHOICE OF TERMINOLOGY
FOR ARTICLE 1075

PAIRS OF                  CONCEPT        TERMINOLOGY
ANALYSTS                  CONSISTENCY    CONSISTENCY

 6 and 5                  57.1%          0.0%
 6 and 11                 66.7%          0.0%
 6 and 2                  38.1%          0.0%
 6 and 13                 58.3%          0.0%
 5 and 11                 53.8%          0.0%
 5 and 2                  38.1%          0.0%
 5 and 13                 58.3%          0.0%
11 and 2                  35.0%          0.0%
11 and 13                 54.5%          0.0%
 2 and 13                 36.8%          0.0%

ARITHMETIC MEAN
OF ALL PAIRS              49.7%          0.0%

At least four of the five analysts also created verbal
labels for the concepts LIBRARY SCHOOLS and RECRUITMENT.
These access points do not occur in the title. The degree
of influence exerted by the title on the analysts' choice
of concepts has not been investigated for this study, but
might prove a worthwhile area to explore. There have been
studies that compare indexers' choice of terms for a given
text with the words chosen from the title of the text for a
KWIC or a KWOC index.

In a concept category like 1075I, RECRUITMENT, the
four verbal labels listed could be regularized for terminology
easily by human manipulation, or even by a computer
program using semantic or morphological rules for
standardization. These verbal labels were:

 6 1075 IA     INFORMATION SCIENTISTS-RECRUITMENT
 5 1075 IA     RECRUITMENT-INFORMATION SCIENTISTS
11 1075 IA     RECRUITING INFORMATION SCIENTISTS
 2 1075 I      RECRUITING

As they stand now, they are not a match in terminology.
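
A computer regularization of the kind suggested above might look like the following. This is a deliberately crude sketch and no such program was part of the study: the suffix list, the function names stem and regularize, and the decision to ignore word order are all assumptions made for illustration.

```python
import re

# Hypothetical morphological rules: strip one common suffix per word.
SUFFIXES = ("MENT", "ING", "S")


def stem(word):
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def regularize(label):
    """Reduce a verbal label to an unordered set of word stems."""
    return frozenset(stem(w) for w in re.findall(r"[A-Z]+", label.upper()))


labels = {
    6: "INFORMATION SCIENTISTS-RECRUITMENT",
    5: "RECRUITMENT-INFORMATION SCIENTISTS",
    11: "RECRUITING INFORMATION SCIENTISTS",
    2: "RECRUITING",
}
regular = {analyst: regularize(label) for analyst, label in labels.items()}
print(regular[6] == regular[5] == regular[11])   # the three longer labels
print(regular[2] == regular[6])                  # analyst 2's one-word label
```

Under these rules the labels of analysts 6, 5, and 11 reduce to the same set of stems, while the label of analyst 2 remains distinct because it names only one of the two concepts.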

Categories 1075B, TRAINING, and 1075K, ACADEMIC
TRAINING PROGRAMS, might have been combined into one concept
category except for the verbal label TRAINING created by
analyst 2. There are other kinds of training besides
academic training, a fact analyst 2 was surely aware of.
However, in the end result, we only have access to the
terms the indexer actually recorded; therefore his use of
the word by itself must be assumed to reflect his perception
of the concept he believed was embodied in the article. To
assume that he meant academic training would be to augment
his verbal label. Therefore this was not done and two
categories had to be established to encompass the two
concepts. The fact that analyst 2 also chose the verbal label
CURRICULUM, a word that refers to academic training, and
was therefore included under category 1075K, does not alter
this.



Types of Concept Categories 
In the categorization of article 1075 on the previous 
pages and in the print-out of article categorizations in 
Appendix F, some of the concept categories can be seen to 
contain two or more concepts or a concept and a modifier. 
These categories were established because the categorizer
felt that a multiple concept category would be more useful
for the particular article than establishing two separate
categories.

There were also articles in which a separate category
for a single concept was established and a multiple concept
category was also established that included the separate
concept, e.g. LIBRARIES; PUBLIC LIBRARIES; URBAN PUBLIC
LIBRARIES.

Some categories contain only a concept that may be
referred to as a standard modifier or subdivision (a concept
which has the ability to narrow and/or modify the scope of
another concept, e.g., ADVANTAGES; METHODS). The cases
in which these were established as separate categories were
cases in which the categorizer perceived them as the focus
of the analysts' labels, i.e. when it seemed that METHODOLOGY
or EVALUATION was the central concern. Standard modifiers
were established as separate categories also in cases where,
within a single article, many different categories would
have contained different standard modifiers or repeated a
single standard modifier.

In articles where these standard modifiers formed
part of a multiple concept category, it was the categorizers'
judgement that this was the most appropriate way to treat
the concept(s) and that, in a sense, the multiple concept
category established was similar to a bound term, e.g.
NEWARK CHARGING SYSTEM, ADVANTAGES; not NEWARK CHARGING
SYSTEM and ADVANTAGES.

For the purposes of the categorization, names of
organizations, journals, etc., were treated as single
concepts and not broken into the concepts ordinarily signified
by the individual words in their titles, e.g., the
verbal label CHEMICAL ABSTRACTS, which refers to the title
of a journal, appears under the category 1075Y, CHEMICAL
ABSTRACTS, but not under category 1075R, CHEMISTRY.

After the categorization process was completed, the
formulas described in Chapter III were used to analyze the
data.



Analysts' Verbal Labels

Although most of the analysts usually employed verbal
labels containing two or more words, some seldom used more
than two words for a verbal label unless the label were the
name of an organization, publication, or similar previously
established multi-word grouping. However, it was noted
that those analysts whose individual verbal labels contained
few words created a greater number of individual
verbal labels for a given article. Examples of this are
displayed in Table IV - 2. Analyst GK, who created a
relatively large number of verbal labels for each article,
consistently used only one or two words per label. Other
analysts created fewer labels for each article, but used
more words per label.



TABLE IV - 2

NUMBER OF VERBAL LABELS PER ARTICLE IN COMPARISON
TO NUMBER OF WORDS PER VERBAL LABEL

ARTICLE                NUMBER OF         AVERAGE NUMBER
NUMBER      ANALYST    VERBAL LABELS     OF WORDS PER
                                         VERBAL LABEL

1127        GK         10                1.7
            EP          4                6.5
            LB          8                4.8
            BC          3                7.6
            JY          4                3.2

1116        GK          9                1.8
            EP          4                5.5
            LB          4                6.7
            BC          7                4.1
            JY          7                2.4

1089        GK         15                1.9
            EP          5                7.4
            LB          1                7.0
            BC          6                6.7
            JY          2                7.0

1068        GK         17                1.8
            EP          1                7.0
            LB          1                8.0
            BC          6                4.5
            JY          9                5.2

Categorizers' Evaluation of Analysts'
Verbal Labels

The categorizers attempted to evaluate the analysts'
verbal labels exactly as written. This is the reasoning
behind category 1075M, FINANCIAL SUPPORT OF LIBRARY SCHOOLS.
Analyst 5 wrote the verbal label LIBRARY SCHOOLS - FINANCIAL
SUPPORT. Although one might suppose from other analysts'
labels for article 1075 that she meant financial support of
students, she had written a label consistent with the concept
of financial support of library schools. She also had
written the label LIBRARY SCHOOLS - STUDENT AID. This does
not exclude the possibility that the first-mentioned label
meant support of students since many analysts wrote more
than one label encompassing the same concepts. A perusal of
the article itself would have solved this problem since the
categorizer could have ascertained whether or not the author
had included information on the financial support of library
schools. The point of this study, however, is to categorize
the analysts' perceptions as recorded in the verbal labels
they created. Therefore the verbal label was taken at face
value and a separate category was created for it.

In cases similar to the above, where the categorizer
had doubts about the actual meaning of a word in a label, a
standard dictionary was used to provide definitions.

The use of a dictionary in establishing concept
categories was of real importance in cases where words are
customarily used imprecisely. It is, of course, reasonable
to suppose that the analysts themselves were not always
careful in their use of overlapping or ambiguous words.

A case in point is use of the terms "automation",
"mechanization", and "computerization". The dictionary
defines automation as the automatic operation or control of
machines or processes; and mechanization as the use or
introduction of machines into processes, but also, as the
process of making something automatic. These words have
great overlap in meaning and, in most of the categorizations
in this study, were used as empirically synonymous. The
word "computerization" was distinguished from automation or
mechanization since it was perceived as referring only to
the use or introduction of computers. The fuzzy set bearing
the name "automation" or "mechanization" might include
computerization, but it might not. "Computerization" would
always include the concept of mechanization. (To use a
device which is primarily electronic, not mechanical, is
still to "mechanize".)

There are certain types of analyst verbal labels that 
name their own concepts. For instance, CARLOS CUADRA 
remains Carlos Cuadra in name and in concept. Although 
philosophers may argue that CARLOS CUADRA, 1947 is not 
CARLOS CUADRA, 1967; the concept CARLOS CUADRA names itself 
in a concept categorization. 

This is true of other kinds of names: names of
organizations, for example, like the International Union of
Pure and Applied Chemistry; names of places, like Canada;
names of things, like books.

The categories established for this study are concept 
categories. In many cases, the categories may appear to be 
word-based, rather than concept-based, because the actual 
words in the analyst verbal labels match the words in the 
category name. When this has occurred, it is because it is 
an instance in which the concept named itself. 

Hierarchal Expansion in Verbal Labels 

In addition to the problems encountered in the
concept categorization, a reading of the verbal labels
created by the analysts for some of the articles reveals
certain problems that each analyst had to resolve for
himself and that affected the outcome of this study. One of the
major problems was caused by the lack of guidelines as to
desired hierarchal treatment.

It is apparent from some of the verbal labels that
for some articles, some of the analysts decided to classify
concepts, that is, to group them under a generically higher,
inclusive "class" term, rather than to list each concept
separately at a lower generic level. For example, in an
article on the work of the committees of the Special Libraries
Association, some of the analysts listed each committee;
others classed the information under the verbal label SPECIAL
LIBRARIES ASSOCIATION, COMMITTEES.

It is not now possible to discover what their
reasoning was on this point, and it was not the intent of
this study to do so. There are many possible reasons, ranging
from a desire to complete the work quickly, to the possibility
that the concept they perceived was the class concept, to
the possibility that they may have felt that the generically
higher (class) term was the more useful in the context of
the analysis.

This problem of choice of higher generic terms in
contrast to lower generic terms is apparent in the analysis
of articles 1151 and 1233.

In 1151, all the analysts chose the concept NEW
ENGLAND STATE UNIVERSITIES’ LIBRARIES. Each analyst had to 
make an individual decision as to whether the name of each 
separate university should also be identified as a subject 
concept. Only one of them chose to do so. This analyst 
identified six universities by name and also chose to use the 
verbal label ACADEMIC LIBRARIES. This had an effect on the 
consistency statistics for this article. Table IV - 3 
contains the statistics on consistency for the article as it 
was analyzed and categorized. Table IV - 4 contains the 
statistics on consistency that would have resulted if 
analyst 11 had chosen not to create verbal labels for the 
names of the six universities and ACADEMIC LIBRARIES. 
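
The effect of one analyst adding narrower concepts can be shown with invented data. Everything in the sketch below is hypothetical except the two label names taken from the text: the placeholder concepts, the university names, and the assumption that pairwise consistency is the share of categories held in common among all categories used by either analyst.

```python
def consistency(a, b):
    """Assumed measure: shared categories over categories used by either analyst."""
    return 100.0 * len(a & b) / len(a | b)


# Hypothetical concept sets for two analysts of the same article.
shared = {"NEW ENGLAND STATE UNIVERSITIES' LIBRARIES", "CONCEPT X", "CONCEPT Y"}
analyst_broad = shared | {"CONCEPT Z"}

# Seven narrower or extra concepts recorded by only one analyst.
narrow_extras = {f"UNIVERSITY {n}" for n in range(1, 7)} | {"ACADEMIC LIBRARIES"}
analyst_narrow = shared | narrow_extras

with_extras = consistency(analyst_broad, analyst_narrow)
without_extras = consistency(analyst_broad, analyst_narrow - narrow_extras)
print(f"with narrow concepts:    {with_extras:.1f}%")
print(f"without narrow concepts: {without_extras:.1f}%")
```

The two analysts agree on the same three concepts in both cases; only the number of unmatched concepts changes, and with it the consistency score.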

Of course, the terminology consistency changes very
little in the following two tables since only one analyst chose
the universities' names and the label ACADEMIC LIBRARIES.
The mean of the concept consistency for the article is
changed to a greater degree.

TABLE IV - 3

PERCENTAGES OF INTER-INDEXER CONSISTENCY IN CHOICE OF
CONCEPTS AND IN CHOICE OF TERMINOLOGY
FOR ARTICLE NO. 1151

PAIRS OF                  CONCEPT        TERMINOLOGY
ANALYSTS                  CONSISTENCY    CONSISTENCY

 6 and 5                  35.7%          0.0%
 6 and 11                 18.2%          0.0%
 6 and 2                  18.2%          0.0%
 6 and 13                 33.3%          9.1%
 5 and 11                 21.1%          0.0%
 5 and 2                  15.0%          0.0%
 5 and 13                 54.5%          0.0%
11 and 2                  25.0%          3.7%
11 and 13                 26.3%          0.0%
 2 and 13                 20.0%          0.0%

ARITHMETIC MEAN
OF ALL PAIRS              26.7%          1.3%

TABLE IV - 4

PERCENTAGES OF INTER-INDEXER CONSISTENCY IN CHOICE OF
CONCEPT AND IN CHOICE OF TERMINOLOGY FOR ARTICLE
NO. 1151 WITH MODIFICATION OF ANALYST 11
VERBAL LABELS

PAIRS OF                  CONCEPT        TERMINOLOGY
ANALYSTS                  CONSISTENCY    CONSISTENCY

 6 and 5                  35.7%          0.0%
 6 and 11                 26.8%          0.0%
 6 and 2                  18.2%          0.0%
 6 and 13                 33.3%          9.1%
 5 and 11                 33.3%          0.0%
 5 and 2                  15.0%          0.0%
 5 and 13                 54.5%          0.0%
11 and 2                  27.5%          4.5%
11 and 13                 50.0%          0.0%
 2 and 13                 20.0%          0.0%

ARITHMETIC MEAN
OF ALL PAIRS              31.4%          1.4%

This same problem arose in connection with the
analyst verbal labels for article 1233, but in this case,
both the concept consistency percentage and the terminology
consistency percentage would be changed appreciably if some
of the analysts chose not to use narrow as well as broad
concepts.

All of the analysts had chosen the subject concept
LINGUISTICS for article 1233. The problem the analysts
faced was whether or not the individual languages discussed
in the article should be identified as subject concepts.
The particular categories and analyst verbal labels involved
are as follows.

PART OF CATEGORIZATION OF ARTICLE NO. 1233

B. LINGUISTICS

 6 1233 CBA    BIBLIOGRAPHIC CONTROL OF LINGUISTIC SCHOLARSHIP
 5 1233 EDB    LINGUISTIC BIBLIOGRAPHY-INDO-EUROPEAN LANGUAGES
 5 1233 TEB    LINGUISTIC BIBLIOGRAPHY-BIBLIOGRAPHIC ESSAY
 5 1233 VB     LINGUISTICS-ABSTRACTING SERVICES
 5 1233 YWB    LINGUISTICS-SUBJECT INDEXES
 5 1233 ?YXB   LINGUISTICS-COMPUTERIZED INDEXES
 5 1233 YB     LINGUISTICS-CUMULATIVE INDEXES
11 1233 DCBA   BIBLIOGRAPHIC CONTROL OF LINGUISTIC SCHOLARSHIP IN INDO-EUROPEAN LANGUAGES
 2 1233 B      LINGUISTICS
 2 1233 :B     HISTORY OF LANGUAGE
13 1233 EB     BIBLIOGRAPHY, LINGUISTIC
13 1233 ?EB    COMPUTER RETRIEVAL, PROPOSED FOR LINGUISTIC BIBLIOGRAPHY

D. INDO-EUROPEAN

 6 1233 ED     INDO-EUROPEAN LANGUAGES-BIBLIOGRAPHY
 5 1233 EDB    LINGUISTIC BIBLIOGRAPHY-INDO-EUROPEAN LANGUAGES
11 1233 DCBA   BIBLIOGRAPHICAL CONTROL OF LINGUISTIC SCHOLARSHIP IN INDO-EUROPEAN LANGUAGES
 2 1233 D      INDO-EUROPEAN LANGUAGES

H. GREEK

 6 1233 HGE    CLASSICAL GREEK-BIBLIOG.
 6 1233 JIHGE  CLASSICAL STUDIES (GREEK & ROMAN)-BIBLIOG.-DOCTORAL DISSERTATIONS
 2 1233 H      GREEK LANGUAGE

I. LATIN; ROMAN

 6 1233 IE     LATIN-BIBLIOG.
 6 1233 JIHGE  CLASSICAL STUDIES (GREEK & ROMAN)-BIBLIOG.-DOCTORAL DISSERTATIONS
 2 1233 I      LATIN LANGUAGE

M. GERMANIC

 6 1233 ME     GERMANIC LANGUAGES-BIBLIOG.
 2 1233 M      GERMANIC LANGUAGES
13 1233 ME     GERMANIC LANGUAGES, BIBLIOGRAPHY

N. SCANDINAVIAN

 6 1233 NE     SCANDINAVIAN LANGUAGES-BIBLIOG.
 2 1233 N      SCANDINAVIAN LANGUAGES
13 1233 NE     SCANDINAVIAN LANGUAGES, BIBLIOGRAPHY

O. ENGLISH

 6 1233 OE     ENGLISH LANGUAGE-BIBLIOG.
13 1233 OE     ENGLISH LANGUAGE, BIBLIOGRAPHY

P. ROMANCE LANGUAGES

 6 1233 PE     ROMANCE LANGUAGES-BIBLIOG.
 2 1233 P      ROMANCE LANGUAGES
13 1233 PE     ROMANCE LANGUAGES, BIBLIOGRAPHY

Q. CELTIC

 6 1233 QE     CELTIC LANGUAGES-BIBLIOG.
 2 1233 Q      CELTIC LANGUAGES
13 1233 QE     CELTIC LANGUAGES, BIBLIOGRAPHY

R. SLAVIC

 6 1233 RE     SLAVIC LANGUAGES-BIBLIOG.
 2 1233 R      SLAVIC LANGUAGES
13 1233 RE     SLAVIC LANGUAGES, BIBLIOGRAPHY

S. INDIAN

 6 1233 SE     INDIAN (EAST) LANGUAGES-BIBLIOG.
 2 1233 S      INDIAN LANGUAGES


As can 


be seen. 


although 


all analysts chose verbal 



labels that could be categorized as containing the concept 
LINGUISTICS, the analysts varied in their perception of 
individual languages or families of languages as subject 
concepts. Four chose concepts contained in category 1233D, 
INDO-EUROPEAN; three chose concepts contained in categories 
1233M, GERMANIC; 1233N, SCANDINAVIAN 5 1233P, ROMANCE; 

1233Q, CELTIC; 1233R, SLAVIC; and two chose concepts con- 
tained in categories 1233H, GREEK; 12331, LATIN; ROMAN; 
12330, ENGLISH; and 1233S, INDIAN. 

Some of the verbal labels in the above categories 
matched in terminology as well as in concept. In the case 
of article 1233, therefore, both the concept consistency and 
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TABLE IV - 5 



i 

I 

«* 

* 

PERCENTAGES OF INTER-INDEXER CONSISTENCY IN CHOICE OF ? 

CONCEPT AND IN CHOICE OF TERMINOLOGY * 

FOR ARTICLE NO. 1233 



PAIRS OF 
ANALYSTS 


CONCEPT 

CONSIS- 

TENCY 


TERMIN- 

OLOGY 

CONSIS- 

TENCY 


ARITHMETIC 
MEAN OF 
CONCEPT CON- 
SISTENCY OF 
ALL PAIRS 


ARITHMETIC 
MEAN OF 
TERMINOLOGY 
CONSISTENCY 
OF ALL PAIRS 


6 and 5 


17.90 


4.20 






6 and 11 


38.10 


5.30 






6 and 2 


56.50 


4.80 






6 and 13 


50.00 


35.00 






3 and 11 


33.30 


7.10 






5 and 2 


29.20 


6.20 






5 and 13 


16.70 


4.80 






11 and 2 


28.60 


9.10 






11 and 13 


26.30 


6.20 






2 and 13 


40.90 


5.50 


33.8$ 


8.8$ 



0 
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TABLE IV - 6 



PERCENTAGES OP INTER-INDEXER CONSISTENCY IN CHOICE OP 
CONCEPT AND IN CHOICE OF TERMINOLOGY FOR ARTICLE 
NO. 1233 WITH MODIFICATION OF VERBAL LABELS 



PAIRS OF 
ANALYSTS 


CONCEPT 

CONSIS- 

TENCY 


TERMIN- 

OLOGY 

CONSIS- 

TENCY 


ARITHMETIC 
MEAN OF 
CONCEPT CON- 
SISTENCY OF 
ALL PAIRS 


ARITHMETIC 
MEAN OF 
TERMINOLOGY 
CONSISTENCY 
OF ALL PAIRS 


6 and 5 


26.5$ 


4.2$ 






6 and 11 


66.6$ 


11.1$ 






6 and 2 


35.5$ 


4.8$ 






6 and 13 


38.5$ 


12.6$ 






5 and 11 


33.3$ 


7.1$ 






5 and 2 


43.4$ 


6.2$ 






5 and 13 


22.2$ 


4.8$ 






11 and 2 


46.2$ 


9.1$ 






11 and 13 


38.5$ 


13 . 4$ 






2 and 13 


30 . 8 $ 


5.5$ 


38.1# 


7 .9$ 
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the terminology consistency were affected by the analysts’ 
choice of broad or narrow terms. 

Table IV - 5 contains the statistical results for the 
article as it was actually analyzed and categorized. 

Table IV - 6 contains the statistics that would have re- 
sulted if the analysts had chosen not to create labels for 
the names of the individual languages. 
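
The arithmetic means reported in Tables IV - 5 and IV - 6 can be re-derived from the ten pair scores. The following is a minimal sketch in Python, assuming the pair scores as read from the two tables; the variable and function names are illustrative only and are not part of the study.

```python
# Pair scores (percent) for article 1233, as read from Tables IV-5 and IV-6.
# Each tuple is (concept consistency, terminology consistency) for one pair.
TABLE_IV_5 = [(17.9, 4.2), (38.1, 5.3), (56.5, 4.8), (50.0, 35.0), (33.3, 7.1),
              (29.2, 6.2), (16.7, 4.8), (28.6, 9.1), (26.3, 6.2), (40.9, 5.5)]
TABLE_IV_6 = [(26.5, 4.2), (66.6, 11.1), (35.5, 4.8), (38.5, 12.6), (33.3, 7.1),
              (43.4, 6.2), (22.2, 4.8), (46.2, 9.1), (38.5, 13.4), (30.8, 5.5)]


def mean_scores(pairs):
    """Return the arithmetic means (concept, terminology) over all pairs."""
    n = len(pairs)
    concept = sum(c for c, _ in pairs) / n
    terminology = sum(t for _, t in pairs) / n
    return concept, terminology


if __name__ == "__main__":
    for name, table in (("IV-5", TABLE_IV_5), ("IV-6", TABLE_IV_6)):
        c, t = mean_scores(table)
        print(f"Table {name}: concept {c:.2f}%, terminology {t:.2f}%")
```

The computed means (about 33.75 and 8.82 for Table IV - 5, and about 38.15 and 7.88 for Table IV - 6) agree with the tabulated means to within rounding.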

Problems of Classification and Indexing 
as Reflected in the Verbal Labels 

The problem involved in choice of higher or lower 
generic concepts as in article 1233, just discussed, is 
comparable to a problem apparent in the analysis of some of 
the other articles. This problem may imprecisely be called 
the difference between classification and indexing. This 
does not mean the difference between levels of indexing 
(often referred to as indexing specificity) and classi-
fication. "Classification is, in its simplest statement,
the putting together of similar things, or, more fully
described, it is the arranging of things according to like-
ness and unlikeness."1

In classification, a group of items with character-
istics that could be more precisely defined are assigned to
more general categories. When an item is assigned to a
class of items, it is an indication that there is a relation-
ship between the individual item and the other items in the
group.

1Margaret Mann, Introduction to Cataloging and the
Classification of Books, 2nd ed. (Chicago: American Library
Association, 1943), p. 33.

Indexing is the characterization of various concepts 
in an item, or the item itself as a whole, so as to dis- 
tinguish the concept or item from a mass of similar concepts 
or items and thus provide access to the concept or item. The 
various levels of indexing refer to the narrowness or broad-
ness of the concepts to be characterized. If we are to assign
five terms per item indexed, concepts will necessarily be
broader (more inclusive) than if we are to assign twenty
terms per item indexed. It may be that none of these index
terms will characterize the item with a term that groups it
with similar items in a way analogous to the groupings of a
classification system.

An example of this kind of problem may be found in the
analysis of article 1085.

Article 1085 is a collection of brief reports of
various special representatives of the Special Libraries
Association. In the analysis of this article, analyst 4
created only two verbal labels:

4 1085 #CB   SPECIAL LIBRARIES ASSN., SPECIAL
             REPRESENTATIVES' REPORTS, 1966-67

4 1085 #CB1  SPECIAL LIBRARIES, PROGRESS IN THE FIELD,
             SHORT REPORTS SLA, 1966-67

These were analogous to a classification of the content of
the article. The other analysts created verbal labels for
the reports and for the subjects touched on by the reports
and thereby created many more verbal labels than analyst 4,
labels analogous to index entries.

These two approaches seem to reflect a difference in 
the analysts’ perception of the usefulness of two different 
levels of concepts, one of which subsumes the other.

In article 1085, the hierarchic relationship between
levels of verbal labels is not a permanent relationship. The 
reports and the concepts reported on could exist separately 
from the Special Libraries Association. The relationship is 
a relationship established within the context of the article 
and the context of the Special Libraries Association. 

In article 1233, there is a permanent relationship
between the concepts that is not dependent on their con- 
catenation in the context of the article. The concept of 
LINGUISTICS and the concept of the various languages are 
related and do not exist in a non-related form. 

Other Studies of Indexing Methodology
in Which Categories Based on 
Synonymy Were Established 

The concept categorization process for this study,
which is based on synonymy and the fuzzy set, can be related
to the categorization process used in other indexing 
studies (not indexer consistency studies) in which the 
objective was to establish categories based on synonymy. 

Although this study does not attempt to use the 
concept categorizations established for it in any way 
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similar to the way that Montgomery and Swanson use the 
categories they established for their study of the feasi-
bility of automatic assignment of subject headings from
titles for articles cited in the Index Medicus,2 it is
relevant to compare them.

Montgomery and Swanson wanted to establish the extent 
to which the article titles in their sample contained words 
which were "identical to - or near synonyms of - the subject
headings (usually one word) under which the title appear(ed)
in the Index Medicus."3

They therefore established categories of "functional 
synonyms" for the subject headings based on the words to be 
found in the titles under the headings. They stated that 
these words were "functional synonyms" for the subject head- 
ings and that any title containing one of these words could 
have been assigned automatically to the subject heading. 

In Table 3 on page 362 of their study, they give the 
following list of terms as functional synonyms for the 
subject heading ALLERGY: allergy(s), allergic, allergen(s), 

allergenic, allergology, hyperallergy, sensitization, 
sensitized, autosensitization, desensitization, hypersensi- 
tivity, autoimmune, reaction, reagin, anaphylaxis, anaphy- 
lactic, anaphylactoid. 
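
The assignment rule described by Montgomery and Swanson can be sketched as a simple word lookup. The sketch below assumes that a title is matched word by word, ignoring case and punctuation; the function name and the sample titles are hypothetical, the parenthesized plurals in the list are expanded by hand, and the original study may well have matched words differently.

```python
import re

# Functional synonyms for the subject heading ALLERGY, from the list above.
# "allergy(s)" and "allergen(s)" are expanded by hand (an assumption).
ALLERGY_SYNONYMS = {
    "allergy", "allergies", "allergic", "allergen", "allergens",
    "allergenic", "allergology", "hyperallergy", "sensitization",
    "sensitized", "autosensitization", "desensitization",
    "hypersensitivity", "autoimmune", "reaction", "reagin",
    "anaphylaxis", "anaphylactic", "anaphylactoid",
}


def could_assign_allergy(title):
    """Return True if any word of the title is a functional synonym."""
    words = re.findall(r"[a-z]+", title.lower())
    return any(word in ALLERGY_SYNONYMS for word in words)


if __name__ == "__main__":
    # Hypothetical titles, for illustration only.
    print(could_assign_allergy("Anaphylactic shock after penicillin"))
    print(could_assign_allergy("Fracture of the femur in children"))
```

A lookup of this kind also shows why the category is broad: a general word such as "reaction" would place a title under ALLERGY whatever kind of reaction the article discusses.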

2Christine Montgomery and Don R. Swanson, "Machine-
Like Indexing by People," American Documentation XIII
(October 1962): 359-366.

3Ibid., p. 359.
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To judge from this example, the categories of 
functional synonyms established in the Montgomery and Swanson 
study encompass terms with a smaller degree of relatedness 
than would have been allowed in the concept categories 
established for this study. 

Another study which investigated the degree to which
words in a title replicated (matched) or were synonymous to
subject headings assigned to the title by human indexers was
done by Kraft.4 Kraft states:

Interpretation of data revealed, among other things, that
64.4% of the title entries contained as keywords one or
more of the ILP [Index to Legal Periodicals] subject
heading words under which they were indexed; and 25.1%
contained logical equivalents.5

Kraft grouped the titles in his study into five types 
based on five degrees of synonymy or replication. Types 
1 and 2 required replication of a word or a root form of a 
word that appeared in its subject heading for it to be 
counted as a "matching term".

Titles of Type 3 and Type 4 were described as follows:

Type 3. A title which contains a synonym of its ILP
subject heading.

Example:

ILP Heading: Atomic Energy

Title: "Federal Organization for Licensing Major Nuclear
Activities."

Since 'nuclear' in common usage is a synonym of 'atomic
energy', this title is considered as Type 3.6



4Donald H. Kraft, "A Comparison of Keyword-in-Context
(KWIC) Indexing of Titles With a Subject Heading Classifi-
cation System," American Documentation XV (January 1964):
48-52.

5Ibid., p. 48.

6Ibid., p. 49.

Type 4. A title not of Types 1, 2, or 3, but which
contains keywords that would enable a legal researcher
to find it in an obvious manner under a KWIC indexing
system.

Example:

ILP Heading: Collisions at Sea

Title: "Navigational Lights of Warships of Special
Construction: Laws Concerning."7

In his study, Kraft includes titles of Type 3 and
Type 4 as "logical equivalents" to the subject headings
assigned to them. This is a categorization based on synonymy
since synonymy may be defined as "a word or expression
accepted as a figurative or symbolic substitute for another
word or expression."8

Type 3 synonymy would have been acceptable in the 
concept categorization process for this study. Type 4 
synonymy would not have been acceptable. 

These studies are mentioned here for two reasons. 

1. To demonstrate that synonymy of terms has been used as a 
basis for establishing replication of term in indexing 
studies other than indexer consistency studies. 

2. To demonstrate, by at least two non-indexer consistency 
studies, that the concept categorizations based on synonymy 
in this study require a greater degree of relatedness among 
terms included in the concept category than did these other 
studies. 



7Ibid.

8The American Heritage Dictionary of the English
Language, William Morris, ed. (New York: American Heritage
Publishing Co., Inc., 1969), p. 1305.
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CHAPTER V 
FINDINGS 

Display of Statistical Findings 

The findings of this study are based on the statis- 
tics arrived at through use of the formulas given in Chapter 
III. The statistics were arranged in tables. These tables,
displaying statistics for each of the 10 pairs of analysts
for each of the 25 articles in each of the 22 packets in the
study, total 154 pages. Those for Packet XI are displayed in
the seven pages comprising Table V - 1 which follow this dis- 
cussion. 
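
The size of these tables follows from the design of the study. A short sketch, assuming the five analysts' initials as they appear in Table V - 1 for Packet XI and assuming four articles per table page (an inference from the layout of Table V - 1, not a figure stated in the text):

```python
from itertools import combinations
from math import ceil

ANALYSTS = ["AB", "KC", "WM", "MS", "KW"]  # initials as shown in Table V-1
ARTICLES_PER_PACKET = 25
PACKETS = 22
ARTICLES_PER_PAGE = 4  # assumption inferred from the table layout

# Every unordered pair of analysts is scored once per article.
pairs = list(combinations(ANALYSTS, 2))
rows_per_packet = len(pairs) * ARTICLES_PER_PACKET
pages_per_packet = ceil(ARTICLES_PER_PACKET / ARTICLES_PER_PAGE)
total_pages = pages_per_packet * PACKETS

if __name__ == "__main__":
    print(len(pairs), "pairs:", pairs)
    print(rows_per_packet, "rows per packet")
    print(pages_per_packet, "pages per packet,", total_pages, "pages in all")
```

Under these assumptions the sketch yields 10 pairs, 7 pages per packet, and 154 pages for the 22 packets, which matches the figures given above.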

Table V - 1 displays the concept consistency scores 
(column 3) and the terminology consistency scores (column 4) 
for each of the ten pairs of analysts (identified by initials 
in column 2) for each article (identified by number in column 
1) in Packet XI. The arithmetic mean of the concept consist- 
ency scores and the arithmetic mean of the terminology con- 
sistency scores for each pair of analysts for each article 

are displayed in columns 5 and 6.

Appendix G contains the tables for Packets VIII, IX, 
and X. Tables for the other packets in the study may be ob- 
tained from the investigator through 1977. As stated before, 
all of the raw data, tables, instructions and other materials 







TABLE V - 1
PACKET XI

PERCENTAGES OF CONSISTENCY

([?] marks a figure illegible in the source copy; a trailing ? marks an uncertain reading.)

ARTICLE   PAIRS OF     CONCEPT       TERMINOLOGY   ARITHMETIC MEAN   ARITHMETIC MEAN
NUMBER    ANALYSTS     CONSISTENCY   CONSISTENCY   OF CONCEPT        OF TERMINOLOGY
                                                   CONSISTENCY OF    CONSISTENCY OF
                                                   ALL PAIRS         ALL PAIRS

0011      AB and KC     20.0%          0.0%         28.4%              2.6%
          AB and WM     33.3%          0.0%
          AB and MS     40.0%         25.0%
          AB and KW     50.0%?         0.0%
          KC and WM     14.3%?         0.0%
          KC and MS     16.7%          0.0%
          KC and KW     [?]            0.0%
          WM and MS     28.6%          0.0%
          WM and KW     14.3%          0.0%
          MS and KW     16.7%          0.0%

0029      AB and KC     60.0%          0.0%         38.[?]%            0.0%
          AB and WM     14.3%          0.0%
          AB and MS     28.6%?         0.0%
          AB and KW     25.0%?         0.0%
          KC and WM     42.9%          0.0%
          KC and MS     57.1%          0.0%
          KC and KW     50.0%          0.0%
          WM and MS     22.2%          0.0%
          WM and KW     20.0%          0.0%
          MS and KW     62.5%          0.0%

00[??]    AB and KC      0.0%          0.0%         50.0%              0.0%
          AB and WM     60.0%?         0.0%
          AB and MS     50.0%          0.0%
          AB and KW     33.3%          0.0%
          KC and WM     33.3%          0.0%
          KC and MS     33.3%          0.0%
          KC and KW     66.7%          0.0%
          WM and MS    100.0%          0.0%
          WM and KW     66.7%?         0.0%
          MS and KW     66.7%          0.0%

0103      AB and KC     2[?].0%        0.0%         29.0%              1.4%
          AB and WM     40.0%          0.0%
          AB and MS     60.0%          0.0%
          AB and KW     33.3%          0.0%
          KC and WM     14.3%         14.2%
          KC and MS     25.0%          0.0%
          KC and KW     18.2%          0.0%
          WM and MS     14.3%          0.0%
          WM and KW     22.2%          0.0%
          MS and KW     44.4%?         0.0%
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TABLE V - 1 (continued) 
PACKET XI

PERCENTAGES OF CONSISTENCY

([?] marks a figure illegible in the source copy; a trailing ? marks an uncertain reading.)

ARTICLE   PAIRS OF     CONCEPT       TERMINOLOGY   ARITHMETIC MEAN   ARITHMETIC MEAN
NUMBER    ANALYSTS     CONSISTENCY   CONSISTENCY   OF CONCEPT        OF TERMINOLOGY
                                                   CONSISTENCY OF    CONSISTENCY OF
                                                   ALL PAIRS         ALL PAIRS

[?]       AB and KC     [?]            0.0%         23.1%              0.0%
          AB and WM     [?]            0.0%
          AB and MS     16.7%          0.0%
          AB and KW     [?]0.0%        0.0%
          KC and WM     33.3%          0.0%
          KC and MS     16.7%?         0.0%
          KC and KW     33.3%          0.0%
          WM and MS      0.0%          [?]
          WM and KW     11.1%          0.0%
          MS and KW     12.5%          0.0%

0253      AB and KC     20.0%?         0.0%         33.3%              5.0%
          AB and WM     22.2%          0.0%
          AB and MS     50.0%          0.0%
          AB and KW     33.3%?         0.0%
          KC and WM     28.6%          0.0%
          KC and MS     42.9%          0.0%
          KC and KW     25.0%          [?]
          WM and MS     50.0%         50.0%
          WM and KW     27.3%          0.0%
          MS and KW     [?]            0.0%

[?]       AB and KC     [?]            0.0%         18.8%              0.0%
          AB and WM     [?]            0.0%
          AB and MS     [?]            0.0%
          AB and KW     25.[?]%        0.0%
          KC and WM      0.0%          0.0%
          KC and MS     25.0%          0.0%
          KC and KW     25.0%          0.0%
          WM and MS     16.7%?         0.0%
          WM and KW     [?]            0.0%
          MS and KW     12.5%          0.0%

0323      AB and KC     30.0%?         0.0%         26.1%              3.3%
          AB and WM     14.3%?         0.0%
          AB and MS     50.0%          0.0%
          AB and KW     33.3%?         0.0%
          KC and WM      0.0%          0.0%
          KC and MS     37.5%?         0.0%
          KC and KW     25.0%          0.0%
          WM and MS     20.0%         33.3%
          WM and KW     11.1%          0.0%
          MS and KW     40.0%          0.0%
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TABLE V - 1 (continued) 
PACKET XI

PERCENTAGES OF CONSISTENCY

([?] marks a figure illegible in the source copy; a trailing ? marks an uncertain reading.)

ARTICLE   PAIRS OF     CONCEPT       TERMINOLOGY   ARITHMETIC MEAN   ARITHMETIC MEAN
NUMBER    ANALYSTS     CONSISTENCY   CONSISTENCY   OF CONCEPT        OF TERMINOLOGY
                                                   CONSISTENCY OF    CONSISTENCY OF
                                                   ALL PAIRS         ALL PAIRS

0329      AB and KC     4[?].0%        0.0%         36.9%              0.0%
          AB and WM     [?]            0.0%
          AB and MS    100.0%          0.0%
          AB and KW     [?]            0.0%
          KC and WM     [?]0.0%        0.0%
          KC and MS     40.0%          0.0%
          KC and KW     50.0%          0.0%
          WM and MS      0.0%          0.0%
          WM and KW     [?]            0.0%
          MS and KW     60.0%          0.0%

033[?]    AB and KC     75.0%          0.0%         44.5%              5.0%
          AB and WM     50.0%          0.0%
          AB and MS     33.3%         16.7%
          AB and KW     75.0%          0.0%
          KC and WM     40.0%?        33.3%
          KC and MS     28.6%          0.0%
          KC and KW     60.0%          0.0%
          WM and MS     14.3%          0.0%
          WM and KW     40.0%          0.0%
          MS and KW     28.6%?         0.0%

0395      AB and KC     20.0%          0.0%         21.0%              [?]
          AB and WM     1[?].3%        0.0%
          AB and MS     20.0%          0.0%
          AB and KW     28.6%          0.0%
          KC and WM     25.0%         25.0%
          KC and MS     14.3%         12.5%
          KC and KW     22.2%          0.0%
          WM and MS     25.0%?        22.2%
          WM and KW     18.2%          0.0%
          MS and KW     22.2%          0.0%

0425      AB and KC     40.0%          0.0%         40.9%              0.0%
          AB and WM     66.7%          0.0%
          AB and MS     80.0%          [?]
          AB and KW     28.6%          0.0%
          KC and WM     16.7%          0.0%
          KC and MS     20.0%          0.0%?
          KC and KW     50.0%          0.0%
          WM and MS     80.0%          0.0%
          WM and KW     [?]            0.0%
          MS and KW     14.3%          0.0%
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TABLE V - 1 (continued) 
PACKET XI

PERCENTAGES OF CONSISTENCY

([?] marks a figure illegible in the source copy; a trailing ? marks an uncertain reading.)

ARTICLE   PAIRS OF     CONCEPT       TERMINOLOGY   ARITHMETIC MEAN   ARITHMETIC MEAN
NUMBER    ANALYSTS     CONSISTENCY   CONSISTENCY   OF CONCEPT        OF TERMINOLOGY
                                                   CONSISTENCY OF    CONSISTENCY OF
                                                   ALL PAIRS         ALL PAIRS

0[?]41    AB and KC     40.0%          [?]          28.0%              0.0%
          AB and WM      0.0%          0.0%
          AB and MS     50.0%          0.0%
          AB and KW     33.3%?         0.0%
          KC and WM      0.0%          0.0%
          KC and MS     25.0%          0.0%
          KC and KW     30.0%?         0.0%
          WM and MS     20.0%          0.0%
          WM and KW     15.4%          0.0%
          MS and KW     66.7%          0.0%

04[?]6    AB and KC     1[?].5%        0.0%         20.3%              2.0%
          AB and WM     25.0%          0.0%
          AB and MS     5[?].0%        0.0%
          AB and KW     25.0%          0.0%
          KC and WM     14.3%          0.0%
          KC and MS     25.0%          0.0%
          KC and KW     18.2%?         0.0%
          WM and MS     20.0%         20.0%
          WM and KW     12.5%          0.0%
          MS and KW     10.0%          0.0%

0516      AB and KC     50.0%          0.0%         45.3%?             2.2%?
          AB and WM     28.6%          0.0%
          AB and MS     42.[?]%        0.0%
          AB and KW     37.5%          0.0%
          KC and WM     28.6%          0.0%
          KC and MS     42.9%          0.0%
          KC and KW     37.5%          0.0%
          WM and MS     62.5%?        22.2%
          WM and KW     55.5%?         0.0%
          MS and KW     66.7%          0.0%

0552      AB and KC     75.0%          0.0%         47.4%              5.2%
          AB and WM     40.0%          0.0%
          AB and MS     60.0%          0.0%
          AB and KW     42.9%          0.0%
          KC and WM     33.[?]%        0.0%
          KC and MS     50.0%          0.0%
          KC and KW     57.1%?         0.0%
          WM and MS     28.6%         14.3%
          WM and KW     37.5%          [?]
          MS and KW     50.0%          [?]
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TABLE V - 1 (continued) 
PACKET XI

PERCENTAGES OF CONSISTENCY

([?] marks a figure illegible in the source copy; a trailing ? marks an uncertain reading.)

ARTICLE   PAIRS OF     CONCEPT       TERMINOLOGY
NUMBER    ANALYSTS     CONSISTENCY   CONSISTENCY

0571      AB and KC     [?]            0.0%?
          AB and WM     2[?].6%        0.0%
          AB and MS     [?]            0.0%?
          AB and KW     66.7%          0.0%
          KC and WM     41.7%         14.2%
          KC and MS     50.0%?         0.0%
          KC and KW     28.6%          0.0%
          WM and MS     40.0%          0.0%
          WM and KW     22.2%          0.0%
          MS and KW     2[?].0%        0.0%

059[?]    AB and KC    100.0%          0.0%
          AB and WM     50.0%          0.0%
          AB and MS    100.0%?         0.0%
          AB and KW     40.0%          0.0%
          KC and WM     30.0%          0.0%
          KC and MS    100.0%          0.0%
          KC and KW     40.0%          [?]
          WM and MS     50.0%          [?]
          WM and KW     28.6%?         [?]
          MS and KW     40.0%          0.0%

0765      AB and KC     60.0%          0.0%
          AB and WM     14.3%          0.0%
          AB and MS     37.5%          0.0%
          AB and KW     50.0%          0.0%
          KC and WM     14.3%          0.0%
          KC and MS     57.1%          0.0%
          KC and KW     50.0%          0.0%
          WM and MS     22.2%         11.1%
          WM and KW      9.1%          0.0%
          MS and KW     36.4%          0.0%

0[?]33    AB and KC     66.7%          0.0%
          AB and WM     18.2%          0.0%
          AB and MS     22.2%          0.0%
          AB and KW     44.4%          0.0%
          KC and WM     [?]            0.0%?
          KC and MS     12.[?]%        0.0%
          KC and KW     37.5%          0.0%
          WM and MS     20.0%          0.0%
          WM and KW     27.3%          0.0%
          MS and KW     20.0%          0.0%

ARITHMETIC MEAN OF CONCEPT CONSISTENCY OF ALL PAIRS and
ARITHMETIC MEAN OF TERMINOLOGY CONSISTENCY OF ALL PAIRS:
legible fragments on this page, in source order: 3[?], 1.[?],
32.[?], 3.3%, 1.1%. Their assignment to articles is unclear.
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TABLE V - 1 (continued) 
PACKET XI

PERCENTAGES OF CONSISTENCY

([?] marks a figure illegible in the source copy; a trailing ? marks an uncertain reading.)

ARTICLE   PAIRS OF     CONCEPT       TERMINOLOGY   ARITHMETIC MEAN   ARITHMETIC MEAN
NUMBER    ANALYSTS     CONSISTENCY   CONSISTENCY   OF CONCEPT        OF TERMINOLOGY
                                                   CONSISTENCY OF    CONSISTENCY OF
                                                   ALL PAIRS         ALL PAIRS

[?]       AB and KC     [?]0.0%       33.3%         32.9%              7.2%
          AB and WM     25.0%          0.0%
          AB and MS     [?]            0.0%
          AB and KW     57.1%          0.0%
          KC and WM     18.[?]%       10.0%
          KC and MS     33.3%?        20.0%
          KC and KW     28.6%          0.0%
          WM and MS     41.7%?         [?]
          WM and KW     20.0%          0.0%
          MS and KW     50.0%          0.0%
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used in this study will be retained for at least this five 
year period to allow other researchers to check the findings 
and because they may be of use for further studies.

In addition to the tables described above, tables of 
the percentile ranges for the mean inter-indexer concept con- 
sistency, the mean inter-indexer terminology consistency, and 
the number of percentage points difference between the two, 
were constructed for each article for all of the packets. 

All of these tables will be found in Appendix H. 

Comparison of the Statistical Findings 
of Concept Categorizations Done by 
Different Categorizers 

Twenty of the 22 packets of articles in this study 
were concept categorized by one person, the investigator. 

The four packets for which the tables of percentages of con-
sistency are displayed in full (Packets VIII, IX, and X in
Appendix G, and Packet XI on the preceding pages) were cho-
sen for display because they confirm the statement that the
pattern of and the relationships between the consistency per- 
centages do not vary from packet to packet with the person 
who is doing the concept categorization. The patterns and 
relationships are similar for every packet, even though the 
concept categorizations were done by different people. Pack- 
ets X and XI were categorized by the investigator. Packet 
VIII was categorized by someone else, and Packet IX was cat- 
egorized by still another person. 






Each of the categorizers had been an indexer for the 
study. None had analyzed the materials in the packet the 
indexing of which she was asked to categorize. Each was 
asked to do the categorization in accordance with the in- 
structions in Appendix E. 

Table V - 2 displays the percentile ranges for the
mean concept consistency scores and the mean terminology con-
sistency scores for Packets VIII, IX, and X.

As can be seen in Table V - 2, for each of these
packets, the mean concept consistency scores cluster near the
middle of the percentile ranges. The mean terminology con- 
sistency scores cluster at the low end of the percentile 
ranges. There was no instance in these packets, or indeed, 
in any of the packets in the study, in which the concept con- 
sistency score was lower than the terminology consistency 
score; and the number of percentage points difference between 
the two consistency scores in these three packets was never 
less than 14.2 and ranged as high as 78.0.

Since the categorizers differed in experience, educa- 
tion, and points of view, it might be supposed that this 
would create bias in their concept categorizations and that 
therefore the findings for the packets categorized by differ- 
ent people would show variations in pattern. 

The findings for Packets VIII, IX, and X did not vary 
in any significant or substantive way from the findings of 
the other packets in the study, even though they had each 
been categorized by different people. Since the comparisons 
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TABLE V - 2 



PERCENTILE RANGES FOR ARTICLES IN PACKETS VIII, IX, AND X

PERCENTILE       PACKET VIII    PACKET IX      PACKET X       TOTALS
RANGE            CC*   TC**     CC*   TC**     CC*   TC**     CC*   TC**

 0.0 -  0.9       0     4        0    14        0    17        0    35
 1.0 - 10.9       0    19        0    10        0     8        0    37
11.0 - 20.9       4     2        1     1        1     0        6     3
21.0 - 30.9       9     0       11     0        6     0       26     0
31.0 - 40.9       7     0        8     0       10     0       25     0
41.0 - 50.9       4     0        2     0        3     0        9     0
51.0 - 60.9       0     0        3     0        5     0        8     0
61.0 - 70.9       0     0        0     0        0     0        0     0
71.0 - 80.9       1     0        0     0        0     0        1     0
81.0 - 90.9       0     0        0     0        0     0        0     0
91.0 - 100        0     0        0     0        0     0        0     0

* Mean concept consistency
** Mean terminology consistency



TABLE V - 3

PERCENTILE RANGES FOR MEAN INTER-INDEXER CONCEPT CONSISTENCY
AND MEAN INTER-INDEXER TERMINOLOGY CONSISTENCY FOR
ALL ARTICLES IN THE STUDY

PERCENTILE        MEAN CONCEPT     MEAN TERMINOLOGY
                  CONSISTENCY      CONSISTENCY

 0.0 -  0.9            0               200
 1.0 - 10.9            1               312
11.0 - 20.9           24                34
21.0 - 30.9          113                 4
31.0 - 40.9          198                 0
41.0 - 50.9          136                 0
51.0 - 60.9           61                 0
61.0 - 70.9           12                 0
71.0 - 80.9            4                 0
81.0 - 90.9            1                 0
91.0 - 100             0                 0
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were based on statistical results, and there were no expecta- 
tions that the exact categorizations established by one cat- 
egorizer would be reproduced by a different categorizer, in- 
dividual packets were not categorized more than once. The 
fact that the statistical findings of packets categorized by 
different people showed the same pattern for each packet is 
an indication of the validity of the methodology. In the
discussion that follows, these packets will not be treated
separately from the other packets in the study.

Percentile Ranges for Mean Concept 
Consistency Scores and Mean Terminology 
Consistency Scores for All Articles in Study 
Table V - 3 displays the percentile ranges for the 
mean concept consistency scores and the mean terminology con-
sistency scores for all of the articles in the study. It
will be discussed in the following section of this chapter. 
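
The frequencies in Table V - 3 lend themselves to a simple tally of how many articles fall at or below a given score. The following is a minimal sketch, assuming the frequencies as read from that table; the third terminology count is printed illegibly in the source and is taken here as 34, the value that makes the column total 550. The names are illustrative.

```python
# Upper bound of each percentile range in Table V-3, with the number of
# articles whose mean score falls in that range.
UPPER_BOUNDS = [0.9, 10.9, 20.9, 30.9, 40.9, 50.9, 60.9, 70.9, 80.9, 90.9, 100.0]
CONCEPT = [0, 1, 24, 113, 198, 136, 61, 12, 4, 1, 0]
TERMINOLOGY = [200, 312, 34, 4, 0, 0, 0, 0, 0, 0, 0]  # 34 is an assumed reading


def count_at_or_below(counts, bound):
    """Number of articles in ranges whose upper bound does not exceed bound."""
    return sum(n for upper, n in zip(UPPER_BOUNDS, counts) if upper <= bound)


if __name__ == "__main__":
    print("articles per column:", sum(CONCEPT), sum(TERMINOLOGY))
    print("terminology <= 10.9:", count_at_or_below(TERMINOLOGY, 10.9))
    print("terminology <= 20.9:", count_at_or_below(TERMINOLOGY, 20.9))
    print("concept <= 20.9:", count_at_or_below(CONCEPT, 20.9))
```

Both columns account for all 550 articles, and the cumulative counts (512, 546, and 25) are the ones quoted in the discussion of the table.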

For each packet, a similar pattern emerges in the 
percentile ranges for mean inter-indexer consistency in per-
ception of concept scores and for mean inter-indexer consist-
ency in choice of terminology scores. The fact that the pat-
tern repeats itself for each individual packet and for all
packets in the study taken as a whole, even though each pack-
et had different articles, a different combination of index-
ers, and in two cases, different categorizers, provides a
further check on the validity of the methodology. For each
packet, and for all packets in the study taken as a whole,
the mean inter-indexer concept consistency scores cluster at
the middle of the percentile ranges. The mean inter-indexer
terminology consistency scores cluster at the low end of the
percentile ranges. None of the mean concept consistency
scores are in the 0.0% to 0.9% percentile range. Two hundred
of the mean terminology consistency scores are in the 0.0% to
0.9% percentile range.

Although there were individual pairs of analysts in a
small number of articles who scored 0.0% on concept consist-
ency, there was no article for which the mean concept con-
sistency was lower than 9.4%. There were 181 articles, at
least 2 in each packet, for which the mean terminology con-
sistency was 0.0%.

Of the 550 articles in the study, 512 had a mean in-
ter-indexer terminology consistency score of 10.9% or less.
Only one of the 550 articles had a mean inter-indexer concept
consistency score of 10.9% or less and only 25 had a mean
inter-indexer concept consistency score of 20.9% or less.
Five hundred forty-six had a mean terminology consistency
score of 20.9% or less.

Of the 550 articles in the study, only 4 had a mean
inter-indexer terminology consistency of 21.0% or more. Five
hundred twenty-five of the 550 articles had a mean inter-
indexer concept consistency score of 21.0% or more.

There was no instance in which the mean inter-indexer
concept consistency score was lower than the mean inter-
indexer terminology score. In 545 of the 550 articles, the
mean inter-indexer concept consistency score was at least
11.0 percentage points higher than the mean inter-indexer
terminology consistency score.

The number of percentage points difference between
mean concept consistency scores and mean terminology consist-
ency scores was never less than 5.1 and ranged as high as
84.0 percentage points difference.

Percentile Ranges for the Number of Percentage Points
Difference Between the Mean Concept Consistency
Score and the Mean Terminology
Consistency Score

Table V - 4 displays the percentile ranges for the
number of percentage points difference between the mean con-
cept consistency score and the mean terminology consistency
score for each article in the study. These were derived by
subtracting the mean terminology consistency score for each
article from the mean concept consistency score for the arti-
cle, thus arriving at a measure of the number of percentage
points difference between them.

The fact that the mean concept consistency scores were
always higher than the mean terminology consistency scores,
and that, for 500 of the 550 articles, the mean concept con-
sistency score was 21.0 or more percentage points higher than
the mean terminology consistency score shows that a gross
difference exists between these two facets of subject index-
ing, a difference that has not been investigated in the
past because of the previous approach to the measurement of
inter-indexer consistency which did not attempt to differen-
tiate between the two facets of indexing, but encompassed
them both in a single measurement.

TABLE V - 4

NUMBER OF PERCENTILE POINTS DIFFERENCE BETWEEN THE MEAN
INTER-INDEXER CONCEPT CONSISTENCY SCORES AND THE MEAN
INTER-INDEXER TERMINOLOGY CONSISTENCY SCORES FOR
ALL OF THE ARTICLES IN THE STUDY

PERCENTILE        NUMBER OF ARTICLES IN
RANGE             EACH PERCENTILE RANGE

 0.0 -  0.9              0
 1.0 - 10.9              5
11.0 - 20.9             45
21.0 - 30.9            152
31.0 - 40.9            183
41.0 - 50.9            109
51.0 - 60.9             42
61.0 - 70.9             12
71.0 - 80.9              1
81.0 - 90.9              1
91.0 - 100               0
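
The subtraction described above, and the grouping of the result into the percentile ranges of Table V - 4, can be sketched as follows. This is an illustration only: the function names are hypothetical, values are handled in tenths of a percentage point to avoid floating-point edge effects, and the four sample articles are taken from Table V - 5.

```python
def difference(mean_concept, mean_terminology):
    """Percentage points by which concept consistency exceeds terminology."""
    return mean_concept - mean_terminology


def percentile_range(points):
    """Return the label of the Table V-4 range that contains points."""
    tenths = round(points * 10)
    if tenths < 0 or tenths > 1000:
        raise ValueError("difference must lie between 0.0 and 100.0")
    if tenths <= 9:
        return "0.0 - 0.9"
    k = (tenths - 10) // 100  # 0 for 1.0-10.9, 1 for 11.0-20.9, and so on
    low = 1 + 10 * k
    return f"{low}.0 - 100" if k == 9 else f"{low}.0 - {low + 9}.9"


if __name__ == "__main__":
    # (article, mean concept, mean terminology), from Table V-5.
    sample = [("1240", 62.5, 0.0), ("1193", 84.0, 0.0),
              ("1039", 78.0, 0.0), ("0383", 73.0, 16.3)]
    for article, concept, terminology in sample:
        d = difference(concept, terminology)
        print(article, f"{d:.1f}", percentile_range(d))
```

Articles 1193 and 1039 fall in the 81.0 - 90.9 and 71.0 - 80.9 ranges respectively, which is consistent with the single article that Table V - 4 reports in each of those ranges.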



Articles with High Mean Concept Consistency Scores 

Table V - 5 displays the consistency scores for the
17 articles in the study that had a mean concept consistency
score of 61.0% or higher. Six of these 17 had a mean termi-
nology consistency score of 0.0%. Only 3 had a mean termi-
nology consistency score of 10.0% or higher.



Articles with High 
Mean Terminology Consistency Scores 
Table V - 6 displays the consistency scores for the
16 articles in the study that had a mean terminology consist-
ency score of 15.0% or more. The lowest mean concept con-
sistency score for this group of articles was 28.0% and 11
of the 16 articles had a mean concept consistency score of
40.0% or higher. In this group of "high" terminology con-
sistency scores, only two articles had a higher consistency
in terminology than the lowest of the concept consistency
scores.
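
These statements can be checked against the rows of Table V - 6. A short sketch, assuming the scores as read from the table; the names are illustrative.

```python
# (mean concept consistency, mean terminology consistency) for the 16
# articles of Table V-6, in table order.
TABLE_V_6 = [(56.0, 19.9), (54.1, 25.7), (44.3, 18.9), (39.5, 15.3),
             (56.7, 16.4), (42.8, 15.2), (53.8, 15.0), (35.8, 15.0),
             (73.0, 16.3), (41.0, 19.0), (57.5, 30.0), (28.0, 15.8),
             (52.0, 15.8), (36.4, 29.2), (40.0, 21.7), (39.0, 18.6)]

# Lowest mean concept score in the group.
lowest_concept = min(c for c, _ in TABLE_V_6)
# Articles whose mean concept score is 40.0% or higher.
concept_40_or_more = sum(1 for c, _ in TABLE_V_6 if c >= 40.0)
# Articles whose terminology score exceeds the lowest concept score.
terminology_above_lowest = sum(1 for _, t in TABLE_V_6 if t > lowest_concept)

if __name__ == "__main__":
    print(lowest_concept, concept_40_or_more, terminology_above_lowest)
```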
TABLE V - 5

ARTICLES WITH HIGH MEAN CONCEPT CONSISTENCY SCORES
(61.0% or above)

ARTICLE   PACKET    MEAN CONCEPT         MEAN TERMINOLOGY
NUMBER    NUMBER    CONSISTENCY SCORE    CONSISTENCY SCORE

1240      I              62.5%                 0.0%
1096      II             74.0%                10.7%
1193      III            84.0%                 0.0%
1232      III            66.6%                 2.5%
1099      IV             65.6%                 1.5%
0557      V              64.7%                 3.3%
1034      V              62.3%                 1.5%
0545      VII            62.2%                 0.0%
1039      VIII           78.0%                 0.0%
0636      XII            66.7%                 0.0%
0383      XIV            73.0%                16.3%
0712      XV             61.0%                 1.7%
0398      XVI            72.5%                10.0%
0742      XVI            62.7%                 5.0%
0250      XVII           63.3%                 0.0%
0909      XVII           70.0%                 2.0%
0267      XXII           66.7%                 1.7%
TABLE V - 6

ARTICLES WITH HIGH MEAN TERMINOLOGY CONSISTENCY SCORES
(15.0% or above)

ARTICLE   PACKET    MEAN CONCEPT         MEAN TERMINOLOGY
NUMBER    NUMBER    CONSISTENCY SCORE    CONSISTENCY SCORE

1094      I              56.0%                19.9%
1069      IV             54.1%                25.7%
0073      VI             44.3%                18.9%
0047      XIII           39.5%                15.3%
0132      XIII           56.7%                16.4%
0346      XIII           42.8%                15.2%
0412      XIII           53.8%                15.0%
0044      XIV            35.8%                15.0%
0383      XIV            73.0%                16.3%
0S88      XV             41.0%                19.0%
0724      XVII           57.5%                30.0%
0678      XIX            28.0%                15.8%
0910      XIX            52.0%                15.8%
0396      XXI            36.4%                29.2%
0409      XXI            40.0%                21.7%
0263      XXII           39.0%                18.6%
It seemed important to try to ascertain if there was
a bias in the data that would have had an influence on the
findings of this study. Therefore, all of the texts and all
of the analysts' verbal labels for the articles that appeared
in Table V - 5 (Articles with high mean concept consistency
scores) and Table V - 6 (Articles with high mean terminology
consistency scores) were subjected to a gross examination to
see if any of the following variables could be identified as
distinguishing the articles in one table from the articles
in the other:

1. Number of verbal labels created by each analyst; 

2. Number of "name" or "name-like" verbal labels;

3. Degree of analysts' comprehension of text as indicated 
on data gathering sheet; 

4. Number of "sentence-like" verbal labels; 

5. The presence or absence in the analysts' verbal labels 
of the concepts or terminology used in the sub-heads of the 
article; 

6. The presence or absence in the analysts ' verbal labels 
of concepts or terminology used in the title of the article; 

7. The length of the article. 

None of these variables could be said to be distinctive of
one group or the other.

There seemed to be no relationship between high concept
consistency and high terminology consistency. Only one
article appears in both the high (61.0% or above) mean
concept consistency table and the high (15.0% or above) mean
terminology consistency table. This is Article 0383,
Packet XIV: George Douglas Mayo and Alexander A. Longo,
"Training Time and Programed Instruction," Journal of Applied
Psychology L (February 1966): 1-4. This investigator could
find no distinguishing characteristics in the text of the
article or in the verbal labels of the analysts that could
account for the fact that the article had both a high mean
concept consistency score and a high mean terminology
consistency score.
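
The overlap reported above can be checked mechanically. The
following is a minimal Python sketch that uses the article
numbers as they are printed in Tables V - 5 and V - 6; the
variable names are illustrative assumptions and are not part
of the study.

```python
# Article numbers as printed in Table V - 5
# (mean concept consistency of 61.0% or above).
high_concept = {
    "1240", "1096", "1193", "1232", "1099", "0557", "1034", "0545", "1039",
    "0636", "0383", "0712", "0398", "0742", "0250", "0909", "0267",
}

# Article numbers as printed in Table V - 6
# (mean terminology consistency of 15.0% or above).
# "0S88" is copied as it appears in the scan; one digit is illegible.
high_terminology = {
    "1094", "1069", "0073", "0047", "0132", "0346", "0412", "0044", "0383",
    "0S88", "0724", "0678", "0910", "0396", "0409", "0263",
}

# Articles that appear in both tables.
in_both = high_concept & high_terminology
print(sorted(in_both))  # ['0383']
```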



Articles with 61.0 Percentage Points or More
Difference Between the Mean Concept
Consistency Score and the Mean
Terminology Consistency Score

Table V - 7 displays the mean consistency scores of
the 13 articles in the study that had a difference of 61.0
percentage points or more between the mean concept
consistency score and the mean terminology consistency score. All
of these articles also appear in Table V - 5 (Articles with
high mean concept consistency scores). This was to be
expected, of course, since all the articles in Table V - 7 would
have to have a mean concept consistency score of 61.0% or
above. None of the articles in Table V - 7 appear in Table
V - 6 (Articles with high mean terminology consistency
scores).
TABLE V - 7

ARTICLES WITH 61.0% OR MORE DIFFERENCE BETWEEN THE MEAN
CONCEPT CONSISTENCY SCORE AND THE MEAN TERMINOLOGY
CONSISTENCY SCORE

ARTICLE   PACKET    MEAN CONCEPT         MEAN TERMINOLOGY
NUMBER    NUMBER    CONSISTENCY SCORE    CONSISTENCY SCORE

1240      I              62.5%                 0.0%
1096      II             74.0%                10.7%
1193      III            84.0%                 0.0%
1232      III            66.6%                 2.5%
1099      IV             65.6%                 1.5%
0557      V              64.7%                 3.3%
0545      VII            62.2%                 0.0%
1039      VIII           78.0%                 0.0%
0636      XII            66.7%                 0.0%
0398      XVI            72.5%                10.0%
0250      XVII           63.3%                 0.0%
0909      XVII           70.0%                 2.0%
0267      XXII           66.7%                 1.7%
Score and the Mean Terminology Consistency Score

Table V - 8 displays the mean consistency scores of
the 11 articles in the study with a difference of 15.0
percentage points or less between the mean concept consistency
score and the mean terminology consistency score. None of
these articles appear in Table V - 5 (Articles with high
mean concept consistency scores) but two appear in Table
V - 6 (Articles with high mean terminology consistency scores).
These two articles were Article 0678, Packet XIX: Emmet N.
Leith, "Holography - Lenseless 3D Photography," Industrial
Research (August 1966): 41-43; and Article 0396, Packet XXI:
"Office for Scientific and Technical Information," Chemistry
in Britain (1967): 17-18. The texts and analysts' verbal
labels for articles appearing in Table V - 8 were examined
for the variables listed previously, and again, none of these
variables could be said to be characteristic of this group
in particular. However, as a group, the mean concept
consistency scores for the articles in Table V - 8 were lower
than the mean concept consistency scores for the articles in
the study as a whole, 7 of the 11 articles having a mean
concept consistency score of 20.9% or less.
TABLE V - 8

ARTICLES WITH 15.0 OR LESS PERCENTAGE POINTS DIFFERENCE
BETWEEN THE MEAN CONCEPT CONSISTENCY SCORE AND THE MEAN
TERMINOLOGY CONSISTENCY SCORE

ARTICLE   PACKET    MEAN CONCEPT         MEAN TERMINOLOGY
NUMBER    NUMBER    CONSISTENCY SCORE    CONSISTENCY SCORE

0872      VIII           25.8%                11.6%
0960      VIII           14.9%                 0.0%
0294      X              14.4%                 0.0%
0605      XII            27.?%                13.9%
0289      XII             9.4%                 2.1%
0112      XVIII          13.4%                 1.1%
0678      XIX            28.0%                15.8%
0319      XX             20.6%                 6.4%
0502      XX             17.9%                12.6%
0396      XXI            36.4%                29.2%
0232      XXII           13.0%                 7.7%
Two articles in the above table share the distinction
of being the only ones in the study for which an analyst was
unable to create verbal labels. Although the analysts'
instructions clearly stated that they could leave the data
gathering sheet blank if they felt they could not analyze an
article for concepts, apparently only one analyst felt it
necessary to do this. She did not create verbal labels for
Article 0209, Packet XII: Jean M. Perreault, "Coterminous or
Specific: A Rejoinder to Headings and Canons," Journal of
Documentation XXII (December 1966): 319-32?; and Article
0605, Packet XII: C.K. Chow and C.N. Liu, "An Approach to
Structure Adaptation in Pattern Recognition," IEEE
Transactions SSC-2 (December 1966): 73-80.

These are comparatively difficult and technical
articles. However, they are not any more difficult or technical
than many other articles in the study, or even more difficult
or technical than other articles which this particular
analyst had worked on in another packet.







Validation of the Hypothesis 



This study was concerned with the definition of the 
term "indexer consistency" and with the use of this defini- 
tion in establishing quantitative measurements of indexer 
consistency. 

Previous studies had defined indexer consistency in
terms of degree of replication in the index terms chosen
independently by two or more indexers, or by the same indexer
at different times, to label the informational content of a
given text as a means of providing access to the text. This
definition of indexer consistency presented it as an
undifferentiated mix in which the two steps in the indexing
process were unconsciously combined.

The basic assumptions of this study were: 

1. That indexing is an order-dependent technique in that a 
concept must be perceived before it can be expressed in an 
index term; and 

2. That perception of concepts is a process distinct from 
the process of choosing terms with which to characterize the 
concepts perceived. 

This study therefore postulated that indexer consistency
should be defined as having two parts:

1. Indexer consistency in perception of indexable matter
(consistency in choice of subject concepts); and

2. Indexer consistency in choice of term with which to
label the indexable matter perceived.

In the interests of clarity, throughout this study,
the second part of "indexer consistency" has been referred
to as indexer consistency in choice of terminology. Use of
this terminological label in this way is consistent with its
use in previous studies, where it represented an
undifferentiated mix of concept and words, and has been useful for
the purposes of this study. It is necessary to make clear,
however, that form and function combine in index terms, as in
language in general. A concept may be separated from the
term used to describe it: it may exist in a non-word form,
as exemplified by a non-word symbol, or may be characterized
by more than one terminological label. Nevertheless, the
words in an index term, by definition, represent the concept
they are meant to characterize in that term, although they
may also be used to characterize other concepts in other
terms.

In the index term itself, therefore, the form of the
word and the function of the word are combined. The function
of the index term is to represent the concept. The form of
the index term is the actual word or words used. Therefore,
in this study, "indexer consistency in choice of terminology"
represents what was referred to in previous indexer
consistency studies as "indexer consistency", and may be thought of
as an overall measurement that combines both kinds of
consistency. The separation of this kind of indexer
consistency from indexer consistency in choice of concepts has been
the focus of this study.

The hypothesis to be tested was that the degree of
indexer consistency in the perception of indexable matter
can be measured separately from and will be different in
extent from the degree of indexer consistency in the
terminology chosen to characterize that indexable matter.

Because indexing is an order-dependent process, in
that indexable concepts must be perceived before they can be
expressed in words, there was no expectation that indexer
consistency in choice of terminology would exceed indexer
consistency in perception of concepts. Two possibilities
remained:

1. That indexer consistency in choice of terminology would
equal indexer consistency in perception of concept, or

2. That indexer consistency in choice of terminology would
be less than indexer consistency in perception of concept.

If the findings of this study had been that overall
indexer consistency, that is, what has been called indexer
consistency in choice of terminology, was equal to or only
marginally less than indexer consistency in perception of
concept, the study might have been inconclusive, and the
hypothesis not substantiated. However, in this study, for
500 of the 550 articles in the sample, there was a difference
of 21.0 percentage points or more between the mean overall
indexer consistency as represented by the terminology
consistency score and the mean indexer consistency in perception
of concept score. In only five articles was the difference
between these two scores less than 10 percentage points.
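
The count of 500 articles cited above can be re-derived from
the distribution in Table V - 4. The sketch below is only an
illustration in Python; the list name and the helper function
are assumptions introduced here and are not part of the
study's procedure.

```python
# Distribution of the difference between the mean concept score and the
# mean terminology score, as printed in Table V - 4.
# Each entry: (lower bound, upper bound, number of articles).
TABLE_V_4 = [
    (0.0, 0.9, 0), (1.0, 10.9, 5), (11.0, 20.9, 45), (21.0, 30.9, 152),
    (31.0, 40.9, 183), (41.0, 50.9, 109), (51.0, 60.9, 42), (61.0, 70.9, 12),
    (71.0, 80.9, 1), (81.0, 90.9, 1), (91.0, 100.0, 0),
]


def articles_with_difference_at_least(threshold, table=TABLE_V_4):
    """Count the articles in every range whose lower bound is >= threshold."""
    return sum(count for low, _high, count in table if low >= threshold)


total_articles = sum(count for _low, _high, count in TABLE_V_4)
large_difference = articles_with_difference_at_least(21.0)

print(total_articles)    # 550
print(large_difference)  # 500
```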

Thus, the consistency with which the analysts
identified concepts in the articles was always significantly
higher than the consistency with which they chose
terminology to characterize the concepts they perceived. This was
true for each of the 550 articles in the study and for all
of the analysts in the study.

Each packet of 25 articles presented the above
pattern. It did not vary with variations in the education or
work experience of the analysts, with the contents of the
packets, or with the categorizers who established concept
categories. Each grouping of articles (those with "high"
mean concept consistency; "high" mean terminology
consistency; "high" difference between mean concept consistency and
mean terminology consistency; and "low" difference between
mean concept consistency and mean terminology consistency)
contained articles from many different packets.

"Low" mean concept consistency scores and "low" mean
terminology consistency scores were not compared since 512
of the 550 articles had a mean terminology consistency score
of 10.9% or less, while only 1 of the mean concept
consistency scores fell in this category.

Because no official list of terminology was given to
the indexers, a high degree of consistency in choice of
terminology was not expected. On the other hand, all of the
indexers had been educated in the same subject discipline and
therefore had a common professional vocabulary; all were
told to be specific, not generic, in their choice of terms;
and the degree of difference between the mean concept
consistency scores and the mean terminology consistency scores
(21.0 percentage points or more for 500 of the 550
articles) was a gross difference.

The instructions given the indexers on how to choose
indexable matter from text were more explicit than the
instructions given them for choice of terminology, but the
instructions did not indicate either what kind of concepts
were to be considered indexable or the number of concepts
that should be identified for each article.

The indexers were told to be exhaustive, not
selective, in their choice of indexable concepts. They were told
to name all the concepts in each article on which useful
information was given. They were given a generalized context
for their work: a library or information center containing
materials on information science, documentation, and
librarianship.

Given the large differences shown by the data, and 
making allowances for possible statistical error, it appears 
evident that the two indexing steps studied are, as Bernier 
and others have noted, distinct; that they can be measured 
separately; that they differ significantly in degree of con- 
sistency; and that the definition and measurement of indexer 
consistency should reflect this. 

The experimental instruments and methods used in this
study were not highly precise in the statistical meaning of
the word; however, they were statistically accurate. The
findings show such a large difference between degree of
inter-indexer consistency in perception of indexable matter
and degree of inter-indexer consistency in choice of
terminology with which to describe the indexable matter perceived,
that there seems to be no question that these are separate
entities and can and should be considered separately. Higher
precision, although desirable, is not necessary for the
purposes of this study.

Inter-indexer consistency in choice of concept and 
inter-indexer consistency in choice of terminology have not 
been separately considered in previous consistency studies 
nor have they been separately measured in the past. The 
point of this study was to do so. The need for the devising 
of more precise instruments of measurement and for further 
research in this area is evident. 
CHAPTER VI

SUMMARY, CONCLUSIONS, AND IMPLICATIONS OF THIS STUDY

Summary and Conclusions

This study was concerned with the definition of the
concept known as "indexer consistency" and with the use of
this definition in the quantitative measurement of indexer
consistency.

Previous studies had defined indexer consistency as 
equal to the quantitative measure of the degree of match or 
replication (however this was defined) in the terminology 
chosen independently by two or more indexers, or by the same
indexer at different times, to characterize the concepts the 
indexer(s) had perceived as indexable matter in the text. 

Although analyses of the indexing process include 
these two major steps: 

1. The identification of indexable matter in texts; and 

2. The characterization of this indexable matter in words; 
previous studies of indexer consistency do not explicitly
consider these two parts of the indexing process separately.
They make no explicit distinction between them in their final
measurement of indexer consistency, although some of the
studies show an awareness of the distinction between the two
parts in their varying definitions of what may be considered
a "match" in terminology. The effect of this is that
previous measurements of indexer consistency result in indexer
consistency scores that commingle, in an uncontrolled and
undifferentiated way, the two aspects of the indexing
process. Indexer consistency in perception of indexable concepts
in texts and indexer consistency in choice of terminology
with which to characterize the indexable concepts perceived
are not measured or expressed as separate parts of the
problem of indexer consistency.

This study postulated: 

1. That indexer consistency should be defined as consisting
of two distinct parts:

a. Consistency in identification of indexable matter
(perception of concepts in texts); and

b. Consistency in choice of terminology with which
to label and communicate the concepts perceived;

2. That these can be measured separately;

3. That there will be a gross difference in the degree of
each; and

4. That indexer consistency scores should be determined by a
planned use of both measurements.

For the purposes of this study, a test situation was
established in which 550 journal articles concerned with 
topics in the field of library and information science were 
analyzed for indexable concepts by a group of indexers whose 
education and work experience had been in this field. 







Each article was analyzed by 5 people, a total of
2750 analyses in all. The verbal labels that these indexers
created to characterize the concepts they perceived in the
article were then examined.

The verbal labels were examined in order to establish:

1. The degree of replication in the terminology used to 
characterize the concepts the analysts had perceived in the 
text; and 

2. The degree of replication in the concepts perceived. 

This was done by: 

1. A word-for-word comparison of terminology (in accordance
with a definition of "match" in terminology as given in
Chapter III); and

2. The establishment of concept categories based on synonymy
and the mathematical concept of the fuzzy set (also described
in Chapter III).

Similar mathematical formulas were used to arrive at
separate measures for the degree of inter-indexer
consistency in perception of concepts and the degree of
inter-indexer consistency in choice of terminology with which to
describe the concepts perceived.
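
The formulas themselves are given in Chapter III and are not
reproduced in this chapter. Purely as an illustration of how
two separate scores can be derived from the same verbal
labels, the following Python sketch assumes a simple pairwise
overlap measure (shared items divided by all distinct items,
averaged over every pair of analysts) and a crisp lookup
table of concept categories in place of the fuzzy-set
categories used in the study. The labels, the categories,
and the function names are all hypothetical.

```python
from itertools import combinations


def pair_consistency(a, b):
    """Percentage overlap of two sets: shared items / all distinct items.

    This is an assumed measure, not the formula from Chapter III.
    """
    union = a | b
    if not union:
        return 0.0
    return 100.0 * len(a & b) / len(union)


def mean_consistency(sets_by_analyst):
    """Mean of the pairwise scores over every pair of analysts."""
    pairs = list(combinations(sets_by_analyst, 2))
    return sum(pair_consistency(a, b) for a, b in pairs) / len(pairs)


# Hypothetical verbal labels created by three analysts for one article.
labels = [
    {"thesaurus construction", "indexing"},
    {"thesaurus building", "indexing"},
    {"construction of thesauri"},
]

# Hypothetical concept categories: synonymous labels share one category.
category = {
    "thesaurus construction": "C1",
    "thesaurus building": "C1",
    "construction of thesauri": "C1",
    "indexing": "C2",
}
concepts = [{category[label] for label in analyst} for analyst in labels]

terminology_score = mean_consistency(labels)   # word-for-word matches only
concept_score = mean_consistency(concepts)     # matches on concept category

print(round(terminology_score, 1))  # 11.1
print(round(concept_score, 1))      # 66.7
```

In this invented example the three analysts agree closely on
the concepts while sharing only one verbal label, which is
the pattern the study reports.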

The objective was to discover whether there would be 
a salient difference between the degree of inter-indexer con- 
sistency in perception of concepts and the degree of inter- 
indexer consistency in choice of terminology with which to 
characterize the concepts perceived. 

The statistical findings of this study show that
there is a material degree of difference between the
consistency with which these analysts perceived indexable matter in
the texts analyzed and the degree of consistency or
replication in the terms with which they characterized or
communicated these concepts. Degree of consistency in choice of
concept was always significantly higher than degree of
consistency in choice of terminology. In 500 of the 550
articles, it was 21.0 percentage points or more higher.

Scores of mean inter-indexer consistency in choice of
terminology ranged from 0.0% to 30.0%. There were 181
articles, at least 2 in each packet, for which the mean
terminology consistency was 0.0%. Of the 550 articles in the
study, 512 had a mean terminology consistency score of 10.9%
or less; only one had a mean concept consistency score as
low as this.

Scores of mean inter-indexer consistency in
perception of concepts ranged from 9.4% to 84.0%. Of the 550
articles in the study, 525 had a mean concept consistency score
of 21.0% or more. Two hundred fourteen had a mean concept
consistency score of 41.0% or more.

Although it is relatively easy to establish criteria
and define what is meant by "replication of terminology",
establishing criteria and a definition of what is meant by
"replication of concept" is comparatively difficult. This
has not been consciously attempted in previous indexer
consistency studies. The attempt to do so here does not
represent a situation unique to studies of indexing methodology,
however. Studies of other aspects of indexing technology in
which attempts were made to establish concept-based
categories of synonymous terms, or of terms and their logical
equivalents, have resulted in concept categories which
encompass terms with smaller degrees of relatedness than was
required of the terms in the concept categories established
for this study. In addition, even though exact replication
of concept categorizations by different categorizers was not
expected, the results of this study show that substantial
replication of the pattern of statistical results of
categorizations done by different categorizers may be expected.

The findings of this study lead to the conclusions
that:

1. The presently accepted definition of indexer consistency 
should be changed to include explicitly both indexer consist- 
ency in perception of concept and indexer consistency in 
choice of terminology (overall indexer consistency); 

2. Measurements of indexer consistency should be composed 
of either 

a. Two scores: consistency in perception of concept 
and consistency in choice of terminology, or 

b. One score in which both of these measures are 
consciously included, with each, perhaps, being weighted sep- 
arately. 
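
Conclusion 2b leaves the weighting open. One possible
reading, offered only as a sketch, is a weighted average of
the two scores. The function name and the default weight of
0.5 are assumptions made for illustration; the sample values
are those of Article 0383 in Table V - 5.

```python
def combined_consistency(concept_score, terminology_score, concept_weight=0.5):
    """Weighted average of the two scores (both given in percent).

    concept_weight is the share given to the concept score; the
    remainder goes to the terminology score. The weighting scheme is
    hypothetical and is not prescribed by the study.
    """
    if not 0.0 <= concept_weight <= 1.0:
        raise ValueError("concept_weight must lie between 0 and 1")
    return (concept_weight * concept_score
            + (1.0 - concept_weight) * terminology_score)


# Article 0383: mean concept score 73.0%, mean terminology score 16.3%.
equal_weights = combined_consistency(73.0, 16.3)
concept_only = combined_consistency(73.0, 16.3, concept_weight=1.0)

print(round(equal_weights, 2))  # 44.65
print(concept_only)             # 73.0
```

Reporting the two scores side by side, as in conclusion 2a,
avoids having to choose a weight at all.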



Implications of this Study 
The focus of this study has been on problems of
indexer consistency. Its thesis is based on the fact that
previous work on the definition of indexer consistency and
the construction of quantitative measures for indexer
consistency has not formally differentiated between the effects
of two basically different variables: consistency in the
choice of indexable concepts in the text, and consistency in
the verbal expression of the concepts so distinguished.

It is important to note that a measure of indexer 
consistency that combines these two variables without differ- 
entiating them may be quite valuable to the user or producer 
of a particular index. However, in investigations of the
problem of indexer consistency outside the context of a spe- 
cific working situation, it seems reasonable to try to ap- 
proach the problem in relation to a more general indexing 
methodology and theory, the type of methodology and theory 
exemplified by the descriptions of the indexing process that 
have been cited earlier. This is what has been done here. 

The inter-indexer concept consistency scores found
in this study compare well with those of previous studies
which stated that they measured terminology consistency, but
which actually measured an undifferentiated "indexer
consistency" including both consistency in terminology and
consistency in perception of concept.

The consistency scores given in studies in which a
"match" in terminology was defined fairly rigorously ranged
near the terminology consistency scores for this study. In
studies which defined a match in terminology to include
hierarchically related and synonymous terms and achieved a "match"
in terminology through fairly substantial regularization of
the terminology, the consistency scores were roughly
comparable to the concept consistency scores for this study.

It seems evident that in almost all reported indexer
consistency studies, the designer of the study felt that an
exact, character-for-character match in terminology was not
a "satisfactory" measure of indexer consistency. In one
sense, this may represent an attempt to allow for
terminological or verbal inconsistency in expressing consistently
identified concepts, though this idea is not so expressed in
any of the studies cited.

Writers of previous studies who stated that they
defined indexer consistency as consistency in choice of
terminology seemed not to be satisfied with a rigorous
definition of "match" in terminology, but modified their
definition to include varying degrees of "match", some of which
were based on synonymy or hierarchical relationships. This
partly accounts for the wide variation in their statistical
findings and also accounts for the difficulty other
investigators have found in trying to use their results as the basis
for further research.

It is probable that if the analysts in this study
had been given a list of terms, each of which precisely and
unambiguously defined a concept in the articles they were
asked to analyze, and had been required to use these terms
to characterize the concepts they perceived, the scores
for terminology consistency would have been higher. No list
of terms was given to them and they were explicitly instructed
that the verbal labels they created did not have to conform
to any standard list of terms, although, if the analysts
felt that it was appropriate, they could use terms from a
standardized list or in a standardized form. No one actually
used a standardized list, but it seems likely that remembered
standardized forms of terms were used.

Vocabulary control, as exemplified by lists of terms
authorized for use in a given system, is one of the
methodological tools used to standardize index terminology.
Vocabulary control may or may not have an effect on consistency
in indexers' choice of terminology. However, if there is a
list of authorized terms from which the indexers must choose, 
the probability of their choosing matching terms (however 
this is defined) would seem to be increased. The effect, if 
any, that a list of authorized terms would have on consist- 
ency in indexers' perception of concepts is as yet unknown. 
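
How an authorized list might raise terminological agreement
can be pictured with a small Python sketch. The authority
list, the labels, and the function name below are all
hypothetical; the analysts in this study were given no such
list, so the sketch illustrates the mechanism only and does
not represent a finding.

```python
# Hypothetical authority list: variant label -> authorized term.
AUTHORIZED_TERMS = {
    "thesaurus building": "thesaurus construction",
    "construction of thesauri": "thesaurus construction",
    "thesaurus construction": "thesaurus construction",
}


def apply_vocabulary_control(labels, authority=AUTHORIZED_TERMS):
    """Replace each label by its authorized term; keep unlisted labels."""
    return {authority.get(label.lower(), label.lower()) for label in labels}


# Hypothetical free-text labels from two indexers for the same article.
indexer_a = {"Thesaurus building", "indexer consistency"}
indexer_b = {"Construction of thesauri", "consistency of indexers"}

free_matches = indexer_a & indexer_b
controlled_matches = (apply_vocabulary_control(indexer_a)
                      & apply_vocabulary_control(indexer_b))

print(free_matches)        # set()
print(controlled_matches)  # {'thesaurus construction'}
```

The second pair of labels still fails to match because
neither variant is on the list, which is in keeping with the
caution above that vocabulary control may or may not affect
consistency.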

In the study reported on here, no attempt was made to 
relate index quality to indexer consistency. The relation- 
ship, if any, between these two aspects of indexing has 
not been objectively established as yet and no attempt 
is made to do so here. Likewise, there was no attempt 
to distinguish between "significant" terms and concepts 
and "non-significant" terms and concepts. Indeed, there was 
no attempt to distinguish between what should or should not 
have been considered indexable matter for each text, and 
therefore, no judgements were made as to the quality of the 
indexing. The major objective of the study was simply to
record, compare, and analyze the concepts that the analysts
employed in the study perceived in the texts of the articles,
and the words they used to record those concepts.

Implications of this Study for Thesaurus Construction 
and for Instructions to Indexers for the Use of Thesauri 

Thesauri are lists of terms acceptable in a given in- 
formation system, or terms perceived as appropriate for a 
given subject area. They also may contain definitions of the 
terms listed, scope notes, and a syndetic (cross reference) 
apparatus for the display of relationships. In practical use, 
thesauri also often serve the function of outlining and delim- 
iting the concepts that are perceived by the makers and users 
of the thesaurus as lying within the area covered by the in- 
formation system of which the thesaurus is a part. 

A concept represented by a term in a thesaurus
automatically becomes, in the mind of the indexer, an indexable
concept for the information system of which the thesaurus is
a part. The reverse of this, that a concept not represented
by a term in the thesaurus will automatically not be
perceived as an indexable concept, may or may not be true. In
information systems where the indexer may add terms to the
list of terms in the thesaurus relatively freely, this is
almost certainly not true. But the extent to which, if at all,
the listing of terms in a thesaurus affects the indexers'
perception of what concepts in a text are indexable concepts is
an area as yet unexplored. Will a "peripheral" concept be
perceived more readily if it is represented by a term in the
thesaurus? Will "new" concepts, those as yet not represented
by a term in the thesaurus, be perceived more slowly because
they are not listed?

Thesauri also often cause a sort of "pigeon-holing" 
effect. That is, indexers attempt to fit the concepts they 
perceive in a text into the pigeon-holes established by the 
terms in the thesaurus. They perceive a concept and then 
try to find a term in the thesaurus with which to character- 
ize it. Thus, there may be some loss in the accuracy or 
exactness with which a concept is characterized, but there
is likely to be a gain in overall terminological consistency 
for the information system of which the thesaurus is a part. 

Instructions to indexers on how to use a particular 
thesaurus (if written instructions are given) usually are 
concerned with application of the terms in the thesaurus and 
use of the syndetic apparatus. Instructions are usually 
scanty and there are usually no explicit rules defining what 
kinds of concepts should or should not be considered as
indexable. In most thesauri, the only rules given (if any are
given aside from the syndetic structure itself) are rules as 
to how to choose terms with which to label concepts, once 
they have been perceived as indexable, and how to structure 
the terms once chosen. 

This study has demonstrated that the indexing
process may be separated into the components: 1) perception of
indexable concepts; 2) expression of those concepts in words.
These two aspects of the indexing process should be considered
separately in the construction of a thesaurus, and they
often are. Information scientists or librarians working to
build a list of authorized terms, attempt to provide terms 
for all the concepts they perceive as being relevant to the 
areas to be covered by the thesaurus, and also to establish 
and define terms expressing these concepts that will allow 
for effective and efficient indexing. Instructions on the 
use of the terms in the thesaurus should also refer to both 
components of the indexing process. The effect that thesauri 
themselves or instructions to indexers on the use of thesauri 
might have on indexer performance in either of the above com- 
ponents of the indexing process is an area in which more re- 
search is needed. 

Implications of this Study for Indexing Research 

It is hoped that this study will help re-focus the
attention of research workers and other personnel in library 
and information science on the importance of concepts in the 
process of indexing. Much of the recent research in indexing 
has concentrated on the grammar, morphology, linguistic, and 
statistical relationships of terms and not on the concepts 
represented by the word, phrase, or sentence. 

One word can have many shades of meaning; one "mean-
ing" can be characterized by many verbal labels. "Language
enters into . . . conceptual representation only in a naming
capacity. . . ." 1 This study has shown that there will be
more agreement on what concepts have been discussed in a
text than would be obvious simply from the words used to com-
municate those concepts. One cannot assume, as some investi-
gators have, that separate words or phrases chosen from the
matrix of a sentence will adequately represent the informa-
tion content of that sentence.

1 Roger C. Schank, The Use of Conceptual Relations in
Content Analysis and Data Base Storage (Austin, Texas: Tracor

It is a mistake to assume that a word, or a phrase, con-
tains information in the same sense in which a statement
does . . . . the information content of a statement is
not the sum, or combination, of the information content
of its constituent phrases . . . . 2

The words or phrases in the sentence, if taken one by one,
may result in a different informational "meaning" than if the
sentence had been considered as an organic whole with the re-
lationships of the concepts that the words represent still
intact.

Although, in many instances, in indexing, we destroy 
the relationships between concepts in a text when we estab- 
lish separate terms for each concept, it is still the con- 
cept that bears the meaning, not the words used to label the 
concept.

An example of this is the homograph. Take, for instance,
the word "abstract". It can represent many different con-
cepts:

1. A theoretical, non-pragmatic, or non-concrete entity;

2. An abstruse entity not easily understood;

3. A statement summarizing the important points of a given
text;

4. The concentrated essence of a larger whole;

5. An entity thought of or stated without reference to a
specific instance or application;

6. A genre of painting.

2 Y. Bar-Hillel, "A Logician's Reaction to Recent
Theorizing on Information Search Systems," American Documen-
tation VIII (April 1957): 105-6.

The form of the word "abstract" (its spelling) does not 
change with the change in meaning. 

Therefore, in addition to research on the frequency 
or structure of the physical word, phrase, or sentence, it 
would seem that research on the concept, the indexable con- 
cept, should be pursued. 

What distinguishes an indexable concept from a non- 
indexable concept? How do indexers perceive indexable matter? 
Can the concept "an indexable concept" be defined? Can it be 
defined in the abstract or may it only be defined in the con- 
text of an actual indexing situation? 

Can rules and definitions be established that will 
act as guidelines to indexers in the choice of indexable mat- 
ter and will these rules make indexers more consistent (pre-
dictable) in the kinds of concepts they perceive as index-
able?

All of these questions have been posed before. The
investigation reported here makes clearer the potential value
of such studies.

Implications of this Study for Tests of Indexing
Language and Indexing System Effectiveness and Efficiency

Tests and comparisons of indexing systems like those
in the Cleverdon studies,3 those reported in Lancaster and
Mills,4 J.A. Schuller's study,5 or some of the more recent
studies evaluating published indexes reported by Lancaster
and Gillespie,6 seem to show that indexing systems differ in
effectiveness or efficiency by a comparatively small degree.
If this finding is provisionally accepted as fact, is it not
reasonable to suppose that inter-indexer inconsistency in
perception of concepts, in conjunction with the already rec-
ognized phenomenon of inter-indexer inconsistency in termi-
nology, would have an effect great enough to influence these
results significantly?

It is interesting to note that Cleverdon clearly
recognized the indexing process as being composed of the two
steps that form the basis for this study. He states, of the
second step of the indexing process, "if the concept is cor-
rectly translated into the descriptor language, it is capable
of being retrieved whatever descriptor language is used." 7

3 Cyril Cleverdon, ASLIB Cranfield Research Project:
Report on the Testing and Analysis of an Investigation into
the Comparative Efficiency of Indexing Systems (Cranfield,
England: College of Aeronautics, October 1962).

4 F.W. Lancaster and J. Mills, "Testing Indexes and
Index Language Devices: the ASLIB Cranfield Project," Ameri-
can Documentation XV (January 1964): 4-13.

5 J.A. Schuller, "Experience with Indexing and Re-
trieving by UDC and Uniterms," ASLIB Proceedings XII (Novem-
ber 1960): 372-89.

6 F.W. Lancaster and C.J. Gillespie, "Design and Eval-
uation of Information Systems," Annual Review of Information
Science and Technology V (1970): 53-57.

Cleverdon was testing the "operating efficiency" of 
the indexing systems he investigated. He did not intend to 
concern himself with the first part of the indexing process 
(perception of concepts). However, because of the gross 
statistical differences found in the study reported on here 
between indexer consistency in perception of concepts and 
overall indexer consistency as expressed in consistency in 
terminology, it would seem necessary that future tests of 
indexing systems should consciously include indexer percep- 
tion of concepts as one of the variables in the investiga- 
tion. Certainly the differences between the retrieval capa- 
bilities of the systems Cleverdon studied were statistically 
small and might have been significantly different if indexer 
consistency had been one of the variables in the study. 

Implications of this Study for the Improvement of
Indexing Methodology

The most important implication of this study is that 
the indexing process is indeed a two part, order-dependent 
process. It is possible to distinguish between these parts 
and examine each independently of the other. Since they are 
order-dependent, the first step, the identification of index-
able concepts, provides the foundation on which the second
step, the choice of terminology, rests. It is possible to
improve the level of consistency for the second step without
improving the level of consistency of the first step. How-
ever, improvement of the level of consistency in the first
step would have the effect of raising the attainable level
of consistency for both steps. Since these two steps are
order-dependent, the level of consistency of the second step,
the choice of terminology, can not be higher than the level
of consistency of the first step, the perception of indexable
matter. At best, they may be equal.

7 Cleverdon, op. cit., p. 97.

If we could be sure that indexers would display per-
fect consistency in their choice of terminology (100% con-
sistency in choice of terminology) the overall consistency
with which they could assign index terms to a given text
would still depend on the consistency with which they per-
ceived the indexable concepts in that text. For example,
hypothetically, let us say that for a given text, there are
20 indexable concepts that might be perceived by an indexer.
If indexer A perceived concepts 1-10 and indexer B perceived
concepts 11-20, the inter-indexer consistency in perception
of concepts would be 0.0% although each would have perceived
50.0% of the concepts in the text. Their consistency in ter-
minology would likewise probably be 0.0% since they would not
be characterizing the same concepts.

Now, let us suppose that, of the 20 possible index-
able concepts in the text, indexer A perceives 15 and index-
er B perceives 15. They each perceive concepts 1-10 but in
addition, indexer A perceives concepts 11-15 and indexer B
perceives concepts 16-20. Using the formula for concept
consistency described in Chapter III of this study, the
inter-indexer consistency in choice of concept would be
10/20 or 50.0%. If indexer A and indexer B each used the
same terms to label the concepts they had perceived in com-
mon, the inter-indexer level of consistency in choice of ter-
minology for the text could still only reach 50.0% since
there would always remain the 50% of the concepts in the text
that had been perceived by one but not the other.

If, however, they had attained 75.0% inter-indexer
consistency in perception of concepts, that is, each had per-
ceived concepts 1-15, but indexer A had additionally perceived
concepts 16-18 and indexer B had perceived concepts 19-20,
the attainable level of consistency in choice of terminology
would likewise have been raised to 75.0%. This is one reason
why more research on indexer perception of concepts in texts
is necessary. Raising the level of step one raises, by defi-
nition, the attainable level for step two.
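
The three hypothetical cases above can be checked with a short sketch. The Chapter III formula is not reproduced in this part of the text, so the ratio used here (concepts perceived by both indexers, divided by concepts perceived by at least one of them) is an assumption; it is chosen because it reproduces the 0.0%, 50.0%, and 75.0% figures given in the examples. The function and variable names are illustrative only.

```python
def concept_consistency(concepts_a, concepts_b):
    """Percentage of concepts perceived by both indexers out of all concepts
    perceived by at least one of them (assumed form of the Chapter III formula)."""
    a, b = set(concepts_a), set(concepts_b)
    perceived_by_either = a | b
    if not perceived_by_either:
        return 0.0  # neither indexer perceived anything; treated as no agreement
    return 100.0 * len(a & b) / len(perceived_by_either)


def span(first, last):
    """Concept numbers first..last inclusive, as in 'concepts 1-10'."""
    return set(range(first, last + 1))


# The three hypothetical cases from the text (20 indexable concepts).
case_disjoint = concept_consistency(span(1, 10), span(11, 20))
case_half = concept_consistency(span(1, 15), span(1, 10) | span(16, 20))
case_three_quarters = concept_consistency(span(1, 18), span(1, 15) | span(19, 20))

print(case_disjoint, case_half, case_three_quarters)  # 0.0 50.0 75.0
```

Because terminology can only match on concepts that both indexers perceived, each of these values is also the ceiling on terminology consistency for that case.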

There is another aspect to this problem that deserves
mention here, also. Let us again suppose a hypothetical
situation in which there are 20 indexable concepts in a given
text. Let us suppose that indexer A perceives concepts 1-10
and indexer B also perceives concepts 1-10. They are 100%
consistent in their perception of indexable matter. Let us
also suppose that indexer A and indexer B each choose "match-
ing terms" to characterize the concepts they perceive. They
have achieved 100% consistency in terminology. However, there
remain, of the possible indexable concepts in the text,
concepts 11-20, which have been neither perceived nor
expressed by these indexers. They would not have provided
index access points for concepts 11-20. A concept that is
not perceived as indexable can not, by definition, be assigned
an index term.

A user requiring information on concepts 11-20 would
have no way of knowing that this text contained information
on them. The attainable level of indexer-user consistency
(an area not investigated in this study) could not be higher
than 50.0% even though inter-indexer consistency would be
100%. If indexer consistency in perception of concept could
be raised, it may be assumed that attainable indexer-user
consistency would be improved as well.
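
The contrast between perfect inter-indexer consistency and a 50.0% ceiling on indexer-user consistency can be illustrated as follows. This is only a sketch: indexer-user consistency was not investigated in the study, so applying the same shared-over-either ratio to it, and assuming that a user may seek any of the 20 concepts, are both assumptions made here for illustration.

```python
def overlap_percent(set_a, set_b):
    """Shared members as a percentage of all members found in either set."""
    either = set_a | set_b
    return 100.0 * len(set_a & set_b) / len(either) if either else 0.0


ALL_CONCEPTS = set(range(1, 21))   # the 20 indexable concepts in the text
indexer_a = set(range(1, 11))      # indexer A perceives concepts 1-10
indexer_b = set(range(1, 11))      # indexer B perceives the same concepts

# Concepts for which at least one index access point exists.
access_points = indexer_a | indexer_b

inter_indexer = overlap_percent(indexer_a, indexer_b)
indexer_user_ceiling = overlap_percent(access_points, ALL_CONCEPTS)
unreachable = sorted(ALL_CONCEPTS - access_points)

print(inter_indexer)          # 100.0
print(indexer_user_ceiling)   # 50.0
print(unreachable)            # [11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
```

Under these assumptions, agreement between indexers says nothing about the concepts that neither of them perceived, which is the point made in the paragraph above.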

The problem of inter-indexer, intra-indexer, and in-
dexer-user consistency in the perception of concepts in texts
is a problem that is still relatively unexplored. This may
be because the problem of indexer consistency has not before
been overtly separated for study into its two components:
consistency in perception of concept and consistency in
choice of terminology.



APPENDIX A

LIST OF PREVIOUS INDEXER CONSISTENCY STUDIES



Borko, Harold. "Measuring the Reliability of Subject
     Classification by Men and Machines." American
     Documentation, XV (October, 1964): 268-73.

Bryant, E. C. Control of Indexing Errors. Denver: Westat
     Research Analysts, Inc., 1965.

________, King, D. W., and Terragno, P. J. Analysis of an
     Indexing and Retrieval Experiment for the Organo-
     metallic File of the U.S. Patent Office. Denver:
     Westat Research Analysts, Inc., 1963.

Cooper, William S. "Is Interindexer Consistency A Hob-
     goblin?" American Documentation, XX (July, 1969):
     268-78.

Harris, D., Rayward, W. B., and Svenonius, E. The Testing
     of Inter-Indexing Consistency at Various Indexing
     Depths. Chicago: University of Chicago Graduate
     Library School, February, 1966.

Hooper, R. S. Indexer Consistency Tests: Origin, Measure-
     ments, Results and Utilization. Bethesda, Md.:
     IBM Corporation, 1965.

Hurwitz, F. I. "Study of Indexer Consistency." American
     Documentation, XX (January, 1969): 92-4.

Jacoby, J., and Slamecka, V. Indexer Consistency Under
     Minimal Conditions. Bethesda, Md.: Documentation,
     Inc., 1962. (AD 288087).

Jaster, J. J., Murray, B. R., and Taube, M. State of the
     Art of Co-ordinate Indexing. Washington, D.C.:
     Documentation, Inc., 1962. (AD 275393).

Korotkin, A. L., and Oliver, L. H. The Effect of Subject
     Familiarity and the Use of an Indexing Aid Upon
     Inter-Indexer Consistency. Bethesda, Md.:
     General Electric Co., 1966.

________, and ________. A Method for Computing Indexer
     Consistency. Bethesda, Md.: General Elec-
     tric Co., 1964.






________, Oliver, L. H., and Burgis, D. H. Indexing Aids,
     Procedures, and Devices. Bethesda, Md.: General
     Electric Co., 1965.

Kyle, Barbara. Consistency Analysis of Two Indexers
     Using K. C. for Political Science Material.
     London: National Book League, 1962.

Macmillan, J. T., and Welt, I. D. "A Study of Indexing
     Procedures in a Limited Area of the Medical
     Sciences." American Documentation, XII
     (January, 1961): 27-31.

Mullison, W. R. and others. "Comparing Indexing Effi-
     ciency, Effectiveness, and Consistency With or
     Without the Use of Roles." In American Society
     for Information Science. Proceedings. Vol. VI.
     Westport, Conn.: Greenwood, 1969.

Painter, Ann F. Analysis of Duplication and Consistency
     of Subject Indexing Involved in Report Handling
     at the Office of Technical Services, U.S. Depart-
     ment of Commerce. Washington, D.C.: U.S. Office
     of Technical Services, Department of Commerce,
     1963. (PB 181501).

Rayward, W. B., and Svenonius, E. Consistency, Consensus
     Sets and Random Deletion. Chicago: University
     of Chicago Graduate Library School, 1967.

Rodgers, D. J. A Study of Inter-Indexer Consistency.
     Washington, D.C.: General Electric Company, 1961.

________. A Study of Intra-Indexer Consistency.
     Washington, D.C.: General Electric Company, 1961.

St. Laurent, Mary Cuddy. A Review of the Literature of
     Indexer Consistency. Chicago: University of
     Chicago Graduate Library School, 1966.

Saracevic, T. and others. An Inquiry into Testing of In-
     formation Retrieval Systems. Part 1: Objectives,
     Methodology, Design, and Controls. Cleveland,
     Ohio: Case Western Reserve University Center for
     Documentation and Communication Research, 1968.

Slamecka, V., and Jacoby, J. Effect of Indexing Aids on
     the Reliability of Indexers. Bethesda, Md.:
     Documentation, Inc., 1963.






Tell, B. V. "Document Representation and Indexer Con-
     sistency; a Study of Indexing from Titles,
     Abstracts, and Full Text Using UDC and Keywords."
     In American Society for Information Science.
     Proceedings. Vol. VI. Westport, Conn.:
     Greenwood, 1969.

Tinker, John F. "Imprecision in Indexing, Part II."
     American Documentation, IXX (July, 1958): 322-30.

________. "Imprecision in Meaning Measured by Inconsistency
     of Indexing." American Documentation, XVII
     (April, 1966): 96-102.

Zunde, Pranas, and Dexter, M. E. "Indexing Consistency and
     Quality." American Documentation, XX (July, 1969):
     259-67.

________, and Dexter, M. E. "Factors Affecting Indexing
     Performance." In American Society for Information
     Science. Proceedings, Vol. VI. Westport, Conn.:
     Greenwood, 1969.



APPENDIX B

BIOGRAPHICAL INFORMATION ON ANALYSTS

ANALYST'S NAME: ________

EDUCATION: (Please check all that apply.)

Master's degree in library science ____

Master's degree in other subject field ____

Doctoral degree in other subject field ____

Bachelor's degree only ____

Undergraduate major study area was ________

Graduate study was in the area of ________

Is this your first semester in library school? Yes ____
No ____

WORK EXPERIENCE: (Please check all that apply.)

Have you worked in a library or done library-related or
"library type" work? Yes ____ No ____

If yes, how many years? Less than 1 ____ 1-3 ____
4 or more ____

What did the work involve?

Mainly clerical tasks ____

Reference ____

Cataloging and/or classification ____

Administration ____

Teaching ____

Research ____

Subject analysis of written material ____

Acquisitions ____

Automation ____

Circulation ____

Indexing ____

Abstracting ____

Other (Please specify.) ________

INSTRUCTIONS FOR ANALYSTS

For the purposes of this study, you will be given
various journal articles to read. After you read each ar-
ticle, you will be asked to identify the concepts discussed
in the article and write the name of the concept on a data
gathering sheet.

Imagine that you are analyzing the article for an 
information center and library containing material on in- 
formation science, documentation and librarianship. 

What you are being asked to do is to identify 
concepts in the article and write them down by name in the 
words you would ordinarily use to name the concept. They 
do not have to be the words used by the author. They do 
not have to conform to any established indexing language 
or system of subject headings. They should be words or 
phrases that you would use to identify the concepts in the 
article. For convenience, I call these words or phrases 
"verbal labels." Verbal labels define a concept in words.
Your objective should be to name all the concepts in each 
article on which useful information is given. 

Each verbal label should identify one concept
only.

Each concept should be characterized by a separate
verbal label.



Each verbal label should reflect the exact concept 
in the article. For instance, if the article is about 
"Airedales", you would use the verbal label "Airedales",
not the verbal label "Dogs."

Many people feel it is possible to analyze an ar-
ticle for concepts without being able to understand every-
thing written in the article. In other words, you may be
able to indicate what concepts are being discussed in an
article whether or not you understand what is being said
about the concepts. Of course, if you do not understand
what concepts are being discussed, you may leave the data
gathering sheet blank. Please be sure to indicate on the
bottom of the data gathering sheet whether you understand
the article completely, in part, or not at all.

Please do not write on the articles. Write on the
data gathering sheets.

DATA GATHERING SHEET
VERBAL LABELS FOR ARTICLES

Analyst's name: ________

Journal article number: ________

Subject labels:



Please check one:

I understood this article completely ____, in part ____,
not at all ____.

INSTRUCTIONS FOR CATEGORIZATION

Place the five data gathering sheets for an ar-
ticle in a folder. Label the folder with the number of
the article.

Read the first verbal label of the first analyst.

Decide what concept categories need to be es-
tablished for the concept(s) in the first verbal label.
Establish them and write them out on the inside of the
folder, for example:

A Circulation systems
B Mechanization

Call these concept categories "A", "B", "C", etc.,
in order, with no attempt to establish relationships
between them.

Read all the verbal labels created by all the
analysts for the article. Decide which verbal labels,
if any, contain concepts that might be placed under ca-
tegory "A" and write "A" next to these verbal labels.

Do the same for category "B", "C", "D", etc., creating
new concept categories where necessary and returning to
search previously searched verbal labels when a new ca-
tegory has been established.

Proceed until all concepts in all of the verbal
labels for all of the analysts for the article have been
given category labels and all verbal labels have been
searched for all concept categories.
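
Once every verbal label carries its category letters, the bookkeeping that leads from the categorized sheets to the consistency tables can be sketched as below. The judgment of which label belongs under which category remains a human decision; the sketch only does the tallying. The sample labels and letters are invented for illustration, and the ratio used for concept consistency (categories shared by both analysts over categories used by either) is an assumption, since the Chapter III formula is not reproduced here.

```python
from itertools import combinations
from statistics import mean

# Hypothetical categorized sheets for one article:
# analyst -> list of (verbal label, category letters written next to it).
sheets = {
    "AB": [("Circulation systems", "A"), ("Library mechanization", "B")],
    "KB": [("Mechanized circulation", "AB")],
    "KC": [("Charging systems", "A"), ("Costs", "C")],
}


def categories(sheet):
    """All concept categories found in one analyst's verbal labels."""
    return {letter for _label, letters in sheet for letter in letters}


def concept_consistency(cats_a, cats_b):
    """Shared categories as a percentage of categories used by either analyst."""
    either = cats_a | cats_b
    return 100.0 * len(cats_a & cats_b) / len(either) if either else 0.0


per_analyst = {name: categories(sheet) for name, sheet in sheets.items()}

# One score per pair of analysts, then the arithmetic mean of all pairs.
pair_scores = {
    (x, y): concept_consistency(per_analyst[x], per_analyst[y])
    for x, y in combinations(sorted(per_analyst), 2)
}
mean_of_all_pairs = mean(pair_scores.values())

for pair, score in pair_scores.items():
    print(pair, round(score, 1))
print("mean of all pairs:", round(mean_of_all_pairs, 1))
```

With five analysts, as in the study, the same loop yields ten pairs per article, which matches the ten rows per article in the tables of findings.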
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AUTOMATION-BIBLIOGRAPHY 


id 


l JLMO t\t\0 


SYS 1 enS MI'IML.COiO 1 W 
LIBRAP. IES-BIBLI OGRAPHY 


10   1143   UTRKB    GENERAL AND MISCELLANEOUS BIBLIOGRAPHY OF ARTICLES
                     DEALING WITH LIBRARY AUTOMATION, TO 1967
10   1143   TURKDB   BIBLIOGRAPHY OF ARTICLES DEALING WITH
                     ACQUISITIONS ASPECTS OF LIBRARY AUTOMATION, TO 1967
10   1143   TURNBA   BIBLIOGRAPHY OF ARTICLES DEALING WITH LIBRARY OF
                     CONGRESS AUTOMATION PROJECTS, TO 1967
10   1143   TURKEB   BIBLIOGRAPHY OF ARTICLES DEALING WITH CATALOGING
                     ASPECTS OF LIBRARY AUTOMATION TO 1967
10   1143   TURGE    BIBLIOGRAPHY OF ARTICLES DEALING WITH BOOK
                     CATALOGS, TO 1967
10   1143   TURKIB   BIBLIOGRAPHY OF ARTICLES DEALING WITH SERIALS
                     ASPECTS OF LIBRARY AUTOMATION, TO 1967
10   1143   TURJ     BIBLIOGRAPHY OF ARTICLES DEALING WITH SYSTEMS
                     ANALYSIS, TO 1967
14   1143   RKB      LIBRARY AUTOMATION. BIBLIOGRAPHY
 7   1143   RKB      LIBRARY AUTOMATION, BIBLIOGRAPHIES

S. 1960

12   1143   SRKB     LIBRARY AUTOMATION-BIBLIOGRAPHY-1960

T. BIBLIOGRAPHY OF ARTICLES

10   1143   UTRKB    GENERAL AND MISCELLANEOUS BIBLIOGRAPHY OF ARTICLES
                     DEALING WITH LIBRARY AUTOMATION, TO 1967
10   1143   TURKDB   BIBLIOGRAPHY OF ARTICLES DEALING WITH
                     ACQUISITIONS ASPECTS OF LIBRARY AUTOMATION, TO 1967
10   1143   TURNBA   BIBLIOGRAPHY OF ARTICLES DEALING WITH LIBRARY OF
                     CONGRESS AUTOMATION PROJECTS, TO 1967
10   1143   TURKEB   BIBLIOGRAPHY OF ARTICLES DEALING WITH CATALOGING
                     ASPECTS OF LIBRARY AUTOMATION TO 1967
10   1143   TURGE    BIBLIOGRAPHY OF ARTICLES DEALING WITH BOOK
                     CATALOGS, TO 1967
10   1143   TURKIB   BIBLIOGRAPHY OF ARTICLES DEALING WITH SERIALS
                     ASPECTS OF LIBRARY AUTOMATION, TO 1967
10   1143   TURJ     BIBLIOGRAPHY OF ARTICLES DEALING WITH SYSTEMS
                     ANALYSIS, TO 1967

U. UP TO 1967

10   1143   UTRKB    GENERAL AND MISCELLANEOUS BIBLIOGRAPHY OF ARTICLES
                     DEALING WITH LIBRARY AUTOMATION, TO 1967
10   1143   TURKDB   BIBLIOGRAPHY OF ARTICLES DEALING WITH
                     ACQUISITIONS ASPECTS OF LIBRARY AUTOMATION, TO 1967
10   1143   TURNBA   BIBLIOGRAPHY OF ARTICLES DEALING WITH LIBRARY OF
                     CONGRESS AUTOMATION PROJECTS, TO 1967
10   1143   TURKEB   BIBLIOGRAPHY OF ARTICLES DEALING WITH CATALOGING
                     ASPECTS OF LIBRARY AUTOMATION TO 1967
10   1143   TURGE    BIBLIOGRAPHY OF ARTICLES DEALING WITH BOOK
                     CATALOGS, TO 1967
10   1143   TURKIB   BIBLIOGRAPHY OF ARTICLES DEALING WITH SERIALS
                     ASPECTS OF LIBRARY AUTOMATION, TO 1967
10   1143   TURJ     BIBLIOGRAPHY OF ARTICLES DEALING WITH SYSTEMS
                     ANALYSIS, TO 1967

APPENDIX G

STATISTICAL TABLES OF FINDINGS FOR PACKETS VIII, IX, AND X
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TABLE V - 1 
PACKET VIII

PERCENTAGES OF CONSISTENCY

ARTICLE      PAIRS OF     CONCEPT        TERMINOLOGY    ARITHMETIC MEAN OF     ARITHMETIC MEAN OF
NUMBER       ANALYSTS     CONSISTENCY    CONSISTENCY    CONCEPT CONSISTENCY    TERMINOLOGY CONSISTENCY
                                                        OF ALL PAIRS           OF ALL PAIRS

0068         AB and KB    40.0%          0.0%
             AB and KC    42.9%          0.0%
             AB and HK    9.1%           0.0%
             AB and BM    62.5%          0.0%
             KB and KC    33.3%          0.0%
             KB and HK    [illegible]    0.0%
             KB and BM    36.4%          0.0%
             KC and HK    25.0%          0.0%
             KC and BM    37.5%          0.0%
             HK and BM    18.5%          16.7%          34.5%                  1.7%

0091         AB and KB    42.9%          0.0%
             AB and KC    33.3%          0.0%
             AB and HK    [illegible]    [illegible]
             AB and BM    28.6%          0.0%
             KB and KC    42.9%          0.0%
             KB and HK    [illegible]    0.0%
             KB and BM    23.1%          0.0%
             KC and HK    25.0%          [illegible]
             KC and BM    20.0%          7.1%
             HK and BM    23.1%          0.0%           34.2%                  2.0%

0123         AB and KB    40.0%          0.0%
             AB and KC    23.1%          0.0%
             AB and HK    [illegible]    0.0%
             AB and BM    20.0%          0.0%
             KB and KC    9.1%           0.0%
             KB and HK    12.5%          0.0%
             KB and BM    [illegible]    0.0%
             KC and HK    10.0%          0.0%
             KC and BM    6.7%           0.0%
             HK and BM    8.3%           0.0%           17.8%                  0.0%

0207         AB and KB    33.3%          0.0%
             AB and KC    30.0%          0.0%
             AB and HK    60.0%          0.0%
             AB and BM    40.0%          0.0%
             KB and KC    37.5%          [illegible]
             KB and HK    [illegible]    26.0%
             KB and BM    [illegible]    0.0%
             KC and HK    57.1%          20.0%
             KC and BM    30.0%          0.0%
             HK and BM    50.0%          0.0%           40.4%                  [illegible]
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TABLE V - 1 
PACKET VIII

PERCENTAGES OF CONSISTENCY

ARTICLE      PAIRS OF     CONCEPT        TERMINOLOGY    ARITHMETIC MEAN OF     ARITHMETIC MEAN OF
NUMBER       ANALYSTS     CONSISTENCY    CONSISTENCY    CONCEPT CONSISTENCY    TERMINOLOGY CONSISTENCY
                                                        OF ALL PAIRS           OF ALL PAIRS

0302         AB and KB    35.3%          0.0%
             AB and KC    11.8%          0.0%
             AB and HK    11.1%          0.0%
             AB and BM    31.6%          0.0%
             KB and KC    18.2%          0.0%
             KB and HK    27.3%          0.0%
             KB and BM    35.7%          0.0%
             KC and HK    18.2%          0.0%
             KC and BM    28.6%          0.0%
             HK and BM    26.7%          20.0%          [illegible]            2.0%

0316         AB and KB    43.8%          0.0%
             AB and KC    20.0%          0.0%
             AB and HK    31.3%          0.0%
             AB and BM    33.3%          0.0%
             KB and KC    30.0%          0.0%
             KB and HK    [illegible]    16.7%
             KB and BM    18.8%          0.0%
             KC and HK    10.0%          0.0%
             KC and BM    16.7%          0.0%
             HK and BM    0.0%           0.0%           24.9%                  1.7%

0352         AB and KB    33.3%          0.0%
             AB and KC    71.4%          0.0%
             AB and HK    45.5%          0.0%
             AB and BM    33.3%          0.0%
             KB and KC    [illegible]    0.0%
             KB and HK    [illegible]    0.0%
             KB and BM    25.0%          0.0%
             KC and HK    [illegible]    0.0%
             KC and BM    [illegible]    20.0%
             HK and BM    [illegible]    20.0%          46.1%                  4.0%

0414         AB and KB    [illegible]    0.0%
             AB and KC    12.1%          0.0%
             AB and HK    24.1%          0.0%
             AB and BM    24.1%          0.0%
             KB and KC    [illegible]    [illegible]
             KB and HK    [illegible]    0.0%
             KB and BM    9.1%           0.0%
             KC and HK    54.0%          21.2%
             KC and BM    46.4%          31.0%
             HK and BM    [illegible]    [illegible]    27.4%                  8.2%
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TABLE V - 1 
PACKET VIII

PERCENTAGES OF CONSISTENCY

ARTICLE      PAIRS OF     CONCEPT        TERMINOLOGY    ARITHMETIC MEAN OF     ARITHMETIC MEAN OF
NUMBER       ANALYSTS     CONSISTENCY    CONSISTENCY    CONCEPT CONSISTENCY    TERMINOLOGY CONSISTENCY
                                                        OF ALL PAIRS           OF ALL PAIRS

0433         AB and KB    [illegible]    0.0%
             AB and KC    [illegible]    0.0%
             AB and HK    42.9%          0.0%
             AB and BM    [illegible]    0.0%
             KB and KC    10.0%          0.0%
             KB and HK    11.1%          0.0%
             KB and BM    [illegible]    0.0%
             KC and HK    [illegible]    33.3%
             KC and BM    42.9%          0.0%
             HK and BM    [illegible]    0.0%           36.5%                  3.3%

0501         AB and KB    80.0%          0.0%
             AB and KC    33.3%          0.0%
             AB and HK    37.5%          0.0%
             AB and BM    83.3%          0.0%
             KB and KC    37.5%          20.0%
             KB and HK    42.9%          25.0%
             KB and BM    66.7%          0.0%
             KC and HK    62.5%          40.0%
             KC and BM    30.0%          0.0%
             HK and BM    33.3%          0.0%           50.7%                  8.5%

0526         AB and KB    50.0%          0.0%
             AB and KC    27.3%          0.0%
             AB and BM    30.8%          0.0%
             AB and HK    18.2%          0.0%
             KB and KC    10.0%          0.0%
             KB and BM    55.6%          14.3%
             KB and HK    42.9%          0.0%
             KC and BM    [illegible]    0.0%
             KC and HK    12.5%          20.0%
             BM and HK    33.3%          0.0%           28.6%                  [illegible]

0551         AB and KB    57.1%          0.0%
             AB and KC    28.6%          0.0%
             AB and HK    42.9%          0.0%
             AB and BM    18.2%          0.0%
             KB and KC    33.3%          0.0%
             KB and HK    62.5%          20.0%
             KB and BM    45.5%          0.0%
             KC and HK    37.5%          0.0%
             KC and BM    40.0%          0.0%
             HK and BM    36.4%          0.0%           40.2%                  2.0%
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TABLE V - 1 
PACKET VIII

ARTICLE      PAIRS OF     CONCEPT        TERMINOLOGY    ARITHMETIC MEAN OF     ARITHMETIC MEAN OF
NUMBER       ANALYSTS     CONSISTENCY    CONSISTENCY    CONCEPT CONSISTENCY    TERMINOLOGY CONSISTENCY
                                                        OF ALL PAIRS           OF ALL PAIRS

[illegible]  AB and KB    66.7%          0.0%
             AB and KC    50.0%          0.0%
             AB and HK    [illegible]    0.0%
             AB and BM    26.0%          0.0%
             KB and KC    26.0%          0.0%
             KB and HK    10.0%          0.0%
             KB and BM    [illegible]    0.0%
             KC and HK    50.0%          9.1%
             KC and BM    [illegible]    [illegible]
             HK and BM    10.0%          12.5%          [illegible]            5.9%

PACKET XI 

PERCENTAGES OF CONSISTENCY 



ARTICLE NUMBER

PAIRS OF ANALYSTS

CONCEPT CONSISTENCY

TERMINOLOGY CONSISTENCY

ARITHMETIC MEAN OF CONCEPT CONSISTENCY OF ALL PAIRS

ARITHMETIC MEAN OF TERMINOLOGY CONSISTENCY OF ALL PAIRS


00%4 


SB and GII 


46 . 7% 


~ (77o£ 


33.5® 


0 . 0 ® 




SB and EL 


15.4®““ 


0 . 0 % 




SB and MS 


16 . 7§ 


0 . 0 % 




SB and K W 


2 5 . 0% _ 


0 . 0 ® 


! 

1 

) 


OB and EL 


21 . 4 ?; . 


0 . 0 % 


OB and MS ] 




0 . 0 % 


OB and KW 


30 . 8 % . 


0 . 0 % 


FT. ana MS 


28 . 6 $ 


oTo® 


F.T, and KW 


66.7% 


0.0$ 


MS and KW 


60 . 0% 


0.0% 1 








o 

o 

(X 

I-* 


SB and GH 


26 7!W 


0.0$ 


25.2®. . 


0 . 0 ® 


SB and EL 


40.0% 


" 0 % 


SB and MS 


it . w~ 


0 . 0% 


SB and W 


44 . 4/0 i 


0 • 0% 


GH and EL 


36.796.7 


0.0% 


OH and MS 


.. z.,z£_ 


0.0$ 


GH and KW 


20.6% 


0 % 


EL and MS 


12.9% . 


o7o$ 


EL and KW 


30 . vT 


0.0% 


MS and KW 


lC.3% 






* 




j 009 c. j 


i SB and GH 


42 




31.0% 


3 . 1 ® 


1 SB anc. TET 


— 27 . 8 M" 


U . U/o 




SB and MET 


HoT 


u .0% 


SB and KW 


17. 6i ~ 


o7W° 


OH and EL 


40.0%“ 


! 0 • 0% 


OH and MS 


22.2% 


1674% 


OB and KW 


31 7SZ 


0.0% 


EL and MS 


2b. ti 


0.0% 


EL and KW 


61. '3® 


0.0% 


MS and KW 


20.0% 


10,0% 








0099 


SR and GH 


27. 3®7 


lb. 2% 


S 


6 . 9 ® 


SB a.nd EL 


i 44 . 4 % 


11.1% 


SB and MS 


11.1% 


12 75% 


SB and KW 


77 .5% 


0.0% 


OH and EL 


1 57.li_ 


: ig-gC 


GH and MS 


14. 3^ 


14.3% 


GH and KW 


so . o?r 


0.0$ 


EL and MS 


16.7$ 


0.0% 


* • 


EL and KW 


60 . 0 ® 


0,0^ 




MS and KW 


25.0® 


0 • U'/a 


- sit 








J ^ • J/° 
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245 



PACKET IX 



PERCENTAGES OF CONSISTENCY 



ARTICLE NUMBER

PAIRS OF ANALYSTS

CONCEPT CONSISTENCY

TERMINOLOGY CONSISTENCY

ARITHMETIC MEAN OF CONCEPT CONSISTENCY OF ALL PAIRS

ARITHMETIC MEAN OF TERMINOLOGY CONSISTENCY OF ALL PAIRS


1 0217 1 


SB and GH 


~ ~33T¥. 


0 . 0 % J 


20 . 1 $ 


2 . 2 $ 


SB and EL 




0 . 0 % 


SB and MS 


a) 

O 

• 

O 


0 . 0 % 


SB and KvI 


54*5134 


q.i$ 


GH and EL 


11-iJSL;. 


o.o$ 


GH and MS 


44 . 45 s . 


12 .5% : 


GH and KW 


OO 

1 

O 


oM.i_ 


EL and MS. 


14.. 3&. 





EL and KW 


11U&_ 


0.05S- ■ 


MS and KW 1 


55 ■ 6 % 


0 . 0 % 








0261 ] 


SB and GH 


30. 0$J 


0 . 0 % 








SB and EL 


27 . M 


OM 




• 




SB and MS 


42.9 % 


0.0% 1 








SB and KW 


20 . 0 % 


oTo% 








GH and HL 


15. m 


0 . 6 % 








GH and MS 


22 . 2 % 


0 . 0 % 






i 


GH and KW 


20.C%T 


0 . 6 $ 






EL and MS 


33.3^ 


“ 076% 








EL and KW 


lb 77fo 


075% 








MS and KW 


25 . 0 $ 


676% 


26.3^ 


0 . 0 $ 






* 




0299 


SB and GH , 


12 . 5 ^ 


oTS^ 








SB and EL i 


IO? 1 


575% 








SB and MS 


■ 12 .5 7 ° 


570% 








SB and KW i 


15 


575% 








GH and EL 


10 .0% 


575% 








GH and MS 


23.5'/ 


7.1% 








GH and KW 1 


19 .0*70 


575% 








EL and MS 


16 72% 


670% 








EL and KW 


21 . 4 % 


670^ 








MS and KW 


14 . 3 % 


575% 


16.6$ 


0.7$ 










0309 


SB and GH 


25.6$“ 


575% 








SB and EL 


ih.iW 


575% 








SB and" MS 


22 .2 Jo 


575% 


• 






i SB and KW 


inw 


67655 


• 






GH and EL 


33-3% 


575% 


\ 






GH and MS 


63 


7575% 








GH and KW 


'30 . 0$ 


9.1% 








EL and MS 


35.7$ 


5 » 0% 


‘ • 






EL and KW 


lb . b/o 


O . U% 








MS and KW 


27.3$ 


11". 1% 


O O C\(lt 


7 6% 










33»0/o 


1 •'J 70 
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PACKET IX 



ARTICLE NUMBER

PAIRS OF ANALYSTS

CONCEPT CONSISTENCY

TERMINOLOGY CONSISTENCY

ARITHMETIC MEAN OF CONCEPT CONSISTENCY OF ALL PAIRS

ARITHMETIC MEAN OF TERMINOLOGY CONSISTENCY OF ALL PAIRS


“oTOO" 


S3 and W 


06 .7£> " 


77B# 1 








SB and EL 


£ o.o% 


0.0% 








SB and LIS 


60.09$ . 


7 .795 








SB and KW 


4S . 5% 


0-0% 








HR and EL.. 


63 .6«S_ 


O.OSL 








SB and MS 


G4.4% 


6-795 ; 








HR and KW 


60.0% 


O.C& i 


* 






■RT, and VS 


44.4% 


0-0% 


„ [ 






FT. and KW 


40.0% 


o.o% : 








IMS and KWH 


62 . 6% 


0.0% 


63-7% . 


LM 


l 








0420 j 


SB and GH 


40.095 


0.0% 






SB and EL 


lS.8% 


OM 








SB and MS 


26 .795 


0.0% 


• 






i SB and KW 


26 . 6 % 


0.0% 








| GH and EL 


40. OS 


o7o% 








GH and MS 


3B,5i_ 


0.0% 








GH and KW 


30 .M 


o7o% 








EL and MS 


60.0# 


0.0% 




' 




EL and KW 


bb.Tfo 


0 . 0 % 1 








MS and KW 


66.7% 


oTo% 


42.7% 


0.0% 






• 




-qJToT 


SB and GH 


50.0# 


070# 








SB and EL 


30 . 6 ?’ 


070% 








SB and MS 


27.3% 


07 0# 








SB and KW 


tlt% " 


070% 




• 




GH and Ll 


30.6# 


077% 








GH and 


20.0# 


0 . 0 # 








GH arid" - KIT 


40.0% 


0 . 0 % 








EL and ids 


30.0#""' 


070# 


» 






“ILL and KW 


23.1% 


• 070% 








MS and KW 


13 . 3# " 


070% 


28.6% 


0.0% 










T55& 


SB and Ud 


— 15 . b$ ’ 


070#“ 






SB and eL" 




070% 








SB and iiS 


573F: 


0 . 0 % 


* 






SB and TT 


1 ^ 75 % 


07o% 


• 






GH and 


57751“ 


oM 


s 






GH and. MS 


40 . 0% 


16 . 2 % 










47 . 1 % 


1 777 % 








EL and MS 


41.7# ~ 


070% 


1 , 






EL ana KW 


40.0#"“ 


070% 


' 






'ids and KW 


25.0% 


a . 3% 
















3.4% 



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ARTI- 

CLE 

NUM- 

BER 



PAIRS OF 
ANALYSTS 



PACKET IX 

PERCENTAGE OF CONSISTENCY 

.{ARITHMETIC 

TERMIN- MEAN OF 
OLQGY (CONCEPT 
CONS IS- CONSISTENT 

TENCY ;|OF ALL 

v •PATHS. 



CONCEPT 

CONSIS- 

TENCY 



ARITHMETIC 

MEAN OF TER* 
[•OTOLOGY 
CONSISTENCY 

)F ALL 

3 ATRS- 



052; 



QR and GH. 



23 mm 



23 and Lu. 






'5B and "ICvT 
and EH 



1 . 1' %~~ 

77757 * 
75777 
14 : 

TT7 3 ^ 



JL*J 






D 



“ 



7# 



0596 



CtH and MS 
GH and KW 



37 . % 

T® 



EL and MS I 



EL and KW 



MS and K 



V 



7w~ 



■2STd^ 



SB and GH 



5a 



SB and BL 



51 



SB and MS 



“oS" 



JU 



03 



2 . 

T3707 



TT 



0 



0 



0 



j GH and EL j 


50.0® 


50.0% I 


GH and MS 


sOiT 


so.o7 1 


GH and KW { 


“ 11.1%^ 


14.3% —1 


EL and MS 


ISM— 


. W d iM 


EL and KV/1 


14.3% — 


• 0.0% 1 


MS and KW 


14.3% 


0.0$ j 


‘1 SB and gh 


I 15 • ^ J / Q 


1 U . 0$ 1 


SB and el 


1 EFTS^ 


U . u 70 | 


SB and ms 1 


1 trn% 


U . U70 1 


SB and KW 1 


1 — Wr&° 


U . U 70 1 


GH and EL 1 


1 B2T Wo 


5 • 3%, I 


GH and MS 


1 % .B% 


12 . 47o | 


GH and KW 


VP. 

0 

• 

0 

TP 


U • U v /o 1 


EL and MS 


1 57. Wo 


b. (70 I 


EL and Kvf 


\ 

E 

O 

• 

O 


0 . u 70 . 1 


MS and. Kw 


j Oj. O'/* 


U . U70 j 



22 



Vfo 



32 






,44 



35 






14 



Mi 







PACKET IX 

PERCENTAGE OF CONSISTENCY 



ARTICLE NUMBER

PAIRS OF ANALYSTS

CONCEPT CONSISTENCY

TERMINOLOGY CONSISTENCY


ST* and GH 1 


' 66 . 7 % 1 


Q.Q% — T 


SR and EL 


0 

• 

0 

lg 


o»og— 4 


SR and MS 


^ . m _ 




SR and KvJ 


OC 

0 

• 

0 


O 

• 

c 

O' 


ns.PT and EL 


EO .0% 


0 . 0 % ■ 


GH and MS 




0.0% 


GH and KV7 . 


so .o%_ 


0.6%— 


RT, and MS 


7<=>.0%_ 


0.0% 


■p'T. and KW 




0 . 0 % • 


MS and KVJ.. 


A0.0% 


I 0.0% 



ARITHMETIC MEAN OF CONCEPT CONSISTENCY OF ALL PAIRS

ARITHMETIC MEAN OF TERMINOLOGY CONSISTENCY OF ALL PAIRS




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PACKET X 

PERCENTAGE OF CONSISTENCY 



ARTICLE NUMBER

PAIRS OF ANALYSTS

CONCEPT CONSISTENCY

TERMINOLOGY CONSISTENCY

ARITHMETIC MEAN OF CONCEPT CONSISTENCY OF ALL PAIRS

ARITHMETIC MEAN OF TERMINOLOGY CONSISTENCY OF ALL PAIRS


00 "T 


AB ana SB - 


55 . W~~ 


0^ r 








AB and KC 


62 . 5 ^ 


12 . 6 % 








AB and IL 


22 . 2 % 


cTo^ 








AB and EL 


tOs 


14 . 3 % 








SB and KC 


57 . 1 Sb . 


0 . 0 % 








SB and IL 


12 . 55 


0 . 0 % 








SB and EL 


42 . 9 ? 


0 . 0 ? 








KC and IL 


3373 $ ' 


0 . 0 % 


, 






KC and EL 


2 o . to% 


2 b. 6 % 








IL and EL 


H I za 


0% 


36 .5% 


5 . 5 ?_ 


Odo 2 ~ 


AB and SB 


40 . 0 ? 


0.0% 1 








AB and K(T 


44.455 


0.0% 1 








AB and EL 


2TM ' 


0.0% 1 








AB and IL 


377 m . 


0.0% j 








SB and KC 


30 ’ m 


0.6% j 








SB and EL 


40 . 0 ? 


0.0% 1 








SB and IL 


sc . 4 r~! 


0.0% 1 








KC and EL 


25 .CS 


oTo% 








KC and IL 


40 . 6 ? ; 


0.0% 








EL and IL 


38 - 5 % 


o 7 o% 


1 35.4% 


0.0% 




AB and SB 


33 . 3 =$ 


8 - 3 % _j 






AB and KC 


33 . 3 % 


o.o% :j 








AB and EL 


18.8% 


o.Q% j 








AB and IL 


26 . 7 ® 


0.0% 




• 




SB and KC 


50 . 0 % 


o.o% : 








SB and EL 


35.0s 


0-0% 








SB and IL 


4 ? ■ 1 & 


0.0% — 








KC and EL 


35 . 3 %_ 


0-0% ■ 








KC and IL 


21.1S 


0.0% 








EL and IL 

, 


30 . 0 % 


0.0% 


1 S2 6%. 


SlM 


oisr 


AB and 


■ 14 r 35 T~ 


TX 7 G % — 






AB and NO - 


20.6? 


u . 0% 








AB and eIT 


33 . 3 % 


0 . 0% 


1 t 






AB and IL 


26 . 0 % 


0.0% . _ 


1 1 






SR and KC 


42 . 9 % 


O.Of 


I 






SB and EL 


i 4 . 3 S_ 


0.0% I 








SB and TL 


28.6?r_ 


0.0% 








KC and EL 


oTof 


oTo% 


1 * 






KC and IL 




0 . 0% 








EL and IL 


25 . 0? 


0.6? 


*] 22.0? 


0.0$ 



i 

\ 






V 





PACKET X 



ARTICLE NUMBER

PAIRS OF ANALYSTS

CONCEPT CONSISTENCY

TERMINOLOGY CONSISTENCY


0155 


AB and SB 


50 . og 


Of i 




AB and KC 


42 . 


' o7o % 1 


AB and EL : 66 .7% 


o.o% 




AB and IL 


50.0* 


0.0% 




S B and KC 


7itBr\ 


0.0%"" 


SB and EL 


71.4* 


oTo% 


SB and IL 


75.0% 


0 . 0 % 


ftC and EL 


42.0% 


0 . 0 % 


KC and IL 


50j.0g 


Of : 


EL and IL 


71.4* 


0 . 0 % 








1 rpq4i AB and SB 


20.0% ‘ 


‘ Of i 




AB and KC 


20.0% 


0.0% 


AB and EL 


12 • 3*1 


“ Of 


AB and IL 


14.5* j 


0 . 0 % 1 


SB and KC 


o.o^n 


0 . 0 % ! 


SB and EL 


12 .5f_! 


Of 


SB and IL 


0.0%H 


0 . 0 % 


“KC and EL 


12 . 5 $ : 


~ 0% 


KC and IL 


UK 3 * 


0 . 0 % 


EL and IL 


37.l5M_J 


0.0%~ 








051^ 


AB and SB 


45 .M_ ' 


oTo^ 


AB and KC 


2 b .6% 


0.0% : 


AB and EL 


20.0% 


0.0/T“ 


AB and IL 


65.6% " 


0.0% 


SB and KC 


30 . 0 * 


Of 


1 SB and EL 


45 . 5 ^ " 


0 . 0 / 


SB and IL 


ao.o* 


Of 


KC and EL 


I 28.6% 


0 . 0 % 


: KC and IL 


22 .2% 


0 . 0 * 


EL and IL 


55.6% 


0% 


0379 


AB and SB 


BOf 


— o.o# - 


AB and KC 


lb .7% 


070 % 


AB and EL 


22 .2% 


' OTOf 


! AB* and it 


50.0% “ 


0.0% 


1 SB and KC 


33.5A_ 


0 . 0 % 


S B and EL 


20.0% 


"0.0% 


SB ana IL 


‘ 40.(3% 


0 .73% 


KC and TIL 


0 .0% 


o.0% 


KC and TL 


1 6.0 




EL and ILj 28 . 6 % 


0.0/“ 







ARITHMETIC MEAN OF TERMINOLOGY CONSISTENCY OF ALL PAIRS






59.2/o 



0.0$ 



14 . 4 / 



/ 



0 . 0 % 






hi . 2 $ 



29.1% 



0.0 $ 



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PACKET X 

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PACKET X 

PERCENTAGE OF CONSISTENCY 



ARTICLE NUMBER

PAIRS OF ANALYSTS

CONCEPT CONSISTENCY

TERMINOLOGY CONSISTENCY

ARITHMETIC MEAN OF CONCEPT CONSISTENCY OF ALL PAIRS

ARITHMETIC MEAN OF TERMINOLOGY CONSISTENCY OF ALL PAIRS


0521 


AR and SB 


*50 .OSS . 


12.54 , 


BS.2% . 


6_. 


AR and KC 


SO. 036 


20.04 , 


AR and EL 


76.056 . 


16 .7% — ; 


jAR and UL 


40.04 


0-0^ — 


SR and KC 


40. 035.. . 


16.74, 


SR and EL 


60 .0Sa_ . 


0.04 , 


SR and IL 


52.34_, 


0 . 0% — 


KC and EL 


66.74 


0.04 ; 


KC and TL 


66.74. 


0-04 


FI, and IL. 


60.04 | 


0.04 








0567    AB and SB     36.4%     0.0%     53.9%     0.0%
        AB and KC     60.0%     0.0%
        AB and EL     57.1%     0.0%
        AB and IL    100.0%     0.0%
        SB and KC     36.4%     0.0%
        SB and EL     38.5%     0.0%
        SB and IL     36.4%     0.0%
        KC and EL     57.1%     0.0%
        KC and IL     60.0%     0.0%
        EL and IL     57.1%     0.0%








“0601 


AR and SB 


20..04 


0.04 j 


20.94 


l.l4 


AR and KC 


55.04 J 


O.Oj - 


AB and EL 


27.34 


0.0% 


AB and IL 


22.24, 


0.0% 


SB and KC 


R8.Q% 


0.04 , 


1 SB and EL 


41.2% 


0.04 


SB and IL 


45.54 


0.0%^ 


KC and EL 


10.5% 


0.04 : 


KC and IL 


38 ; wi 


6 . 8$> . 


EL and IL 


40.0% 


0% 








0602    AB and SB     25.0%     0.0%               5.3%
        AB and KC     25.0%    20.0%
        AB and EL     20.0%    16.7%
        AB and IL     33.3%     0.0%
        SB and KC     22.2%     0.0%
        SB and EL      9.1%     0.0%
        SB and IL     25.0%     0.0%
        KC and EL     40.0%    16.7%
        KC and IL     25.0%     0.0%
        EL and IL     20.0%     0.0%











PACKET X 

PERCENTAGE OF CONSISTENCY 



ARTICLE NUMBER

PAIRS OF ANALYSTS

CONCEPT CONSISTENCY

TERMINOLOGY CONSISTENCY

ARITHMETIC MEAN OF CONCEPT CONSISTENCY OF ALL PAIRS

ARITHMETIC MEAN OF TERMINOLOGY CONSISTENCY OF ALL PAIRS


OoOtTi 


AB and SB 


st.eT 


10.0% 






AB and KC 


37 . 5%! 


0.0% 








AB and EL 




' 0.0% 








AB and IL 




25.0% . 








'SB and KC 


sb.bT 


0.0% 






. SB and SL . 


S0.0% 


0.0% 








SB and IL 


2 8 . 6% 


11.1% 








TCP, and BL 


2d.6$S 


OM 








KC and IL 


12 . 5% 


oTo% 








F.Tj and IL 


so.oT 


0.0% 


36 .4 fo 


k 










4 • 0 % 


0704$ AB and SB 


5 o.oT 


$73% 








AB and KC 


12 . 5% 


(!) • 0% 








AB and EL 


s6.o}r i 


070% 








AB and IL 


40 . o T 


" 070% 








SB and kO 


30 . 0% 


U ,0^o 








SB and EL 


2 u.oT 


u.0% 








SB and IL 


37 . 5&1 


o7o% 




* 




KC and EL 


4o7o%! 


0.0%"““ 








KC and IL 


33.3® ' 


0.0$ 








EL and IL 


oc .7%> 1 


r o.o% 


35.5% 


0.0$ 






• 




0753 


AB and SB 


21.4% 


0.0% 






AB and KC: 


50. 6% 


0.0% • 








AB and eL 


33. 3$ 


070% 








AB' and IL 


57 . i® 


0 • 0% 




• 




SB and KQ 


2l.LT 


c . 0% 








SB and EL 


sry 


070% 








SB and IL 


14.3® 


070% 








KC and EL 


33 . 3®" 


0.0% 


* 






KC and 11 


37 . 


o.0% 








EL end II 


40. CT 


U # U^ 


31 . 7 % 


0.0% 


075^ 


AB and SB 




o.0% 






AB and KU 


1275 $” 


0.0% 








AB and EL 


97TT 


o.oT 


« 






AB and IL 


10.0% . 


0.0% " 


• 






SB and KC 


14.3% 


0.0% 


V 






SB and EL 


R7.6% 


oM 








SB and IL 


25. of: 


0.0% 








KC and EL 


. 50.0%_ 


0.0% 


1 * 






fCC and II 


14.. W. 


’* ’ . Of) 


# 






EL and II 


37.5% 


0.6$ 


r\ 0 0 of* 


0 .0% 










23 •^7° 
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ARTI- 

CLE 

NUM- 

BER 



PAIRS OF 
ANALYSTS 



PACKET X 

P ERCENTAGE qf CONSI STENCY 

.^ARITHMETIC JARTTHKETIC 

TERMIN- .iMEAN OF fclEAN Oi- TER 4 

OLOGY CONCEPT JNiINOLOGY 

C ON S I S - | C ONS I S TEK C EC ON S X S TEN C'f 



CONCEPT 

CONSIS- 

TENCY 



TENCY 



OF ALL 



jpF ALL 




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PACKET X 



ARTICLE NUMBER

PAIRS OF ANALYSTS

CONCEPT CONSISTENCY

TERMINOLOGY CONSISTENCY


0935] 


AR and SB 





O.Q^ ; 


AR a,nd KC 




n.ati 


AR an id FT. 


22.2# 


n-oj _ 


A R a n H T T ■ 


S3 . 3%Z_ 


0.0% 


SB and KC 


37 T 5 & . 


0.0% 


SB and EL 


0 

« 

0 


oTo% 


SB and IL 


50.0% 


O 

• 

CP 


KC and EL 


28.6% 


0.0% 


KC and IL 


25.0® . 


0 

• 

0 


EL and IL 


44.4% 


22.2% 


ARITHMETIC MEAN OF TERMINOLOGY CONSISTENCY OF ALL PAIRS

35.9%

2.2%
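
The ARITHMETIC MEAN columns in the packet tables above summarize, for each article, the consistency percentages of all pairs of analysts. With five analysts per packet there are ten pairs. The following is a minimal sketch of that averaging step, not the author's own procedure. The values are those of article 0602 in Packet X as best they can be read from the scan, and the names `ANALYSTS`, `concept`, `terminology`, and `mean_consistency` are hypothetical, introduced only for illustration.

```python
from itertools import combinations
from statistics import mean

# Analyst initials for Packet X (order chosen to match the table rows).
ANALYSTS = ["AB", "SB", "KC", "EL", "IL"]

# Every unordered pair of analysts: 5 analysts give 10 pairs.
PAIRS = list(combinations(ANALYSTS, 2))

# Percentages for article 0602, Packet X, as read from the scan
# (assumption: OCR damage may have altered individual digits).
concept_values = [25.0, 25.0, 20.0, 33.3, 22.2, 9.1, 25.0, 40.0, 25.0, 20.0]
terminology_values = [0.0, 20.0, 16.7, 0.0, 0.0, 0.0, 0.0, 16.7, 0.0, 0.0]

concept = dict(zip(PAIRS, concept_values))
terminology = dict(zip(PAIRS, terminology_values))


def mean_consistency(scores):
    """Arithmetic mean of the pairwise percentages, to one decimal place."""
    return round(mean(scores.values()), 1)


if __name__ == "__main__":
    for pair in PAIRS:
        print(f"{pair[0]} and {pair[1]}: "
              f"{concept[pair]:5.1f}%  {terminology[pair]:5.1f}%")
    print("Mean concept consistency:    ", mean_consistency(concept))
    print("Mean terminology consistency:", mean_consistency(terminology))
```

Under these readings the terminology mean comes to 5.3, which agrees with the figure legible in the scanned table for that article.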



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APPENDIX H 



TABLES OF PERCENTILE RANGES OF SCORES FOR ALL PACKETS OF ARTICLES



PERCENTILE RANGES FOR ARTICLES IN PACKET I 



PERCENTILE 



MEAN CONCEPT MEAN TERMINOLOGY 
CONSISTENCY CONSISTENCY 



DIFFERENCE BETWEEN 

MEAN CONCEPT 
CONSISTENCY AND 
MEAN TERMINOLOGY 
CONSISTENCY 



0.0 - 0.9       0
1.0 - 10.9      0
11.0 - 20.9     0
21.0 - 30.9     4
31.0 - 40.9     12
41.0 - 50.9     6
51.0 - 60.9     2
61.0 - 70.9     1
71.0 - 80.9     0
81.0 - 90.9     0
91.0 - 100      0




0 

0 

1 

8 

10 

4 

1 

1

0 

0 

0 



PERCENTILE RANGES FOR ARTICLES IN PACKET II 



PERCENTILE      MEAN CONCEPT    MEAN TERMINOLOGY    DIFFERENCE BETWEEN MEAN CONCEPT CONSISTENCY
                CONSISTENCY     CONSISTENCY         AND MEAN TERMINOLOGY CONSISTENCY

0.0 - 0.9                                           0
1.0 - 10.9                                          0
11.0 - 20.9                                         2
21.0 - 30.9                                         8
31.0 - 40.9                                         7
41.0 - 50.9                                         7
51.0 - 60.9                                         0
61.0 - 70.9                                         1
71.0 - 80.9                                         0
81.0 - 90.9                                         0
91.0 - 100                                          0
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PERCENTILE RANGES FOR ARTICLES IN PACKET III 



PERCENTILE      MEAN CONCEPT    MEAN TERMINOLOGY    DIFFERENCE BETWEEN MEAN CONCEPT CONSISTENCY
                CONSISTENCY     CONSISTENCY         AND MEAN TERMINOLOGY CONSISTENCY

0.0 - 0.9       0               13                  0
1.0 - 10.9      0               12                  0
11.0 - 20.9     1               0                   2
21.0 - 30.9     7               0                   7
31.0 - 40.9     11              0                   11
41.0 - 50.9     4               0                   3
51.0 - 60.9     0               0                   0
61.0 - 70.9     1               0                   1
71.0 - 80.9     0               0                   0
81.0 - 90.9     1               0                   1
91.0 - 100      0               0                   0



PERCENTILE RANGES FOR ARTICLES IN PACKET IV 



PERCENTILE

0.0 - 0.9
1.0 - 10.9
11.0 - 20.9
21.0 - 30.9
31.0 - 40.9
41.0 - 50.9
51.0 - 60.9
61.0 - 70.9
71.0 - 80.9
81.0 - 90.9
91.0 - 100



MEAN CONCEPT MEAN TERMINOLOGY 
CONSISTENCY CONSISTENCY 



0 

0 

0 

5 



4 

1 

0 

0 

0 



7 

16 

1 

1 

0

0 

0 

0 

0 

0 

0 



DIFFERENCE BETWEEN 
MEAN CONCEPT 
CONSISTENCY AND 
MEAN TERMINOLOGY 
CONSISTENCY 

0 

0

1 

7 

5 

9 

2 

1 

0 

0 

0 






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PERCENTILE RANGES FOR ARTICLES IN PACKET V 



PERCENTILE      MEAN CONCEPT    MEAN TERMINOLOGY    DIFFERENCE BETWEEN MEAN CONCEPT CONSISTENCY
                CONSISTENCY     CONSISTENCY         AND MEAN TERMINOLOGY CONSISTENCY

0.0 - 0.9       0               16                  0
1.0 - 10.9      0               9                   0
11.0 - 20.9     2               0                   3
21.0 - 30.9     5               0                   5
31.0 - 40.9     10              0                   10
41.0 - 50.9     5               0                   4
51.0 - 60.9     1               0                   1
61.0 - 70.9     2               0                   2
71.0 - 80.9     0               0                   0
81.0 - 90.9     0               0                   0
91.0 - 100      0               0                   0
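
The percentile-range tables in this appendix count how many articles in a packet have a mean score falling in each range (0.0 - 0.9, 1.0 - 10.9, and so on up to 91.0 - 100). The sketch below shows one way such a tally could be produced. It assumes scores are percentages rounded to one decimal place, as in the packet tables. The sample scores are illustrative only, and `RANGES`, `range_index`, and `tally` are hypothetical names, not part of the study.

```python
# Range labels as printed in the percentile tables.
RANGES = [
    "0.0 - 0.9", "1.0 - 10.9", "11.0 - 20.9", "21.0 - 30.9",
    "31.0 - 40.9", "41.0 - 50.9", "51.0 - 60.9", "61.0 - 70.9",
    "71.0 - 80.9", "81.0 - 90.9", "91.0 - 100",
]


def range_index(score):
    """Return the index in RANGES of the range that contains `score`.

    Works in integer tenths of a percent to avoid floating-point
    trouble at the range boundaries (10.9 versus 11.0, and so on).
    """
    tenths = round(score * 10)
    if not 0 <= tenths <= 1000:
        raise ValueError(f"score out of range: {score}")
    if tenths < 10:          # 0.0 through 0.9
        return 0
    # 1.0-10.9 -> 1, 11.0-20.9 -> 2, ..., 91.0-100 -> 10
    return min((tenths - 10) // 100 + 1, 10)


def tally(scores):
    """Count how many scores fall in each percentile range."""
    counts = [0] * len(RANGES)
    for score in scores:
        counts[range_index(score)] += 1
    return counts


if __name__ == "__main__":
    # Illustrative mean scores only (assumption, not a full packet).
    sample = [0.0, 5.3, 24.5, 32.6, 35.4, 53.9]
    for label, count in zip(RANGES, tally(sample)):
        print(f"{label:<12} {count}")
```

Applied to the 25 mean scores of a packet, `tally` would yield one column of a table like those in this appendix; the counts in each column of a complete table sum to 25.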







PERCENTILE RANGES FOR ARTICLES I.N PACKET VI 



PERCENTILE      MEAN CONCEPT    MEAN TERMINOLOGY
                CONSISTENCY     CONSISTENCY

0.0 - 0.9       0               11
1.0 - 10.9      0               12
11.0 - 20.9     1               2
21.0 - 30.9     6               0
31.0 - 40.9     10              0
41.0 - 50.9     6               0
51.0 - 60.9     2               0
61.0 - 70.9     0               0
71.0 - 80.9     0               0
81.0 - 90.9     0               0
91.0 - 100      0               0



DIFFERENCE BETWEEN
MEAN CONCEPT 
CONSISTENCY AND 
MEAN TERMINOLOGY 
CONSISTENCY 

0 

1 

1 

8 

2 

2 

0 

0 

0 

0 



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PERCENTILE RANGES FOR ARTICLES IN PACKET VII 
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APPENDIX I 

GLOSSARY 



Concept: A generalized idea of a class of objects; a general idea or
understanding, especially one derived from specific instances or
occurrences.

Fuzzy set: A set in which there is a continuum of grades of membership.

Set: A collection of distinct elements; a collection of particular
things; a collection of things that share common characteristics.

Verbal: Of, pertaining to, or associated with words. In this study, this
word is not used in the sense of the spoken word; the word "oral" is
used for spoken words.
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